Commit 5e7205f

feat: MCP server
1 parent 9e354ce commit 5e7205f

48 files changed

Lines changed: 32567 additions & 151 deletions

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -4,6 +4,7 @@
 *.dll
 *.so
 *.dylib
+.DS_Store

 # Test binary, built with `go test -c`
 *.test
@@ -75,3 +76,5 @@ src/lang/testdata

 tools
 abcoder
+
+!testdata/asts/*.json

README.md

Lines changed: 47 additions & 18 deletions
@@ -13,10 +13,8 @@ ABCoder, an general AI-oriented code-processing SDK, is designed to enhance codi
 - General Parser, parses arbitrary-language codes to UniAST.

 - General Writer, transforms UniAST back to codes.
-
-- (Coming Soon) General Iterator, a framework for visiting the UniAST and implementing code-batch-processing workflows.

-- (Coming Soon) Code Retrieval-Augmented-Generation (RAG), provides a set of tools and functions to help the LLM understand your codes much deeper than ever.
+- Code-Retrieval-Augmented-Generation (Code-RAG), provides a set of MCP tools to help the LLM understand your codes more precisely.

 Based on these features, developers can easily implement or enhance their AI-assisted-programming applications, such as reviewing, optimizing, translating, etc.

@@ -26,21 +24,53 @@ Based on these features, developers can easily implement or enhance their AI-ass
 see [UniAST Specification](docs/uniast-zh.md)


-# Getting Started
+# Quick Start
+
+Below is a quick start guide for using ABCoder to build a coding context on both internal and external libraries.

 1. Install ABCoder:
-```bash
-go install github.com/cloudwego/abcoder@latest
-```
+
+```bash
+go install github.com/cloudwego/abcoder@latest
+```
+
 2. Use ABCoder to parse a repository to UniAST (JSON)
-```bash
-abcoder parse {language} {repo-path} > ast.json
-```
-3. Do your magic with UniAST...
-4. Use ABCoder to write an UniAST back to codes
-```bash
-abcoder write {language} ast.json
-```
+
+```bash
+abcoder parse {language} {repo-path} > xxx.json
+```
+
+for example:
+
+```bash
+git clone https://github.com/cloudwego/localsession.git localsession
+abcoder parse go localsession -o /abcoder-asts/localsession.json
+```
+
+3. Integrate ABCoder's MCP tools into your AI agent.
+
+```json
+{
+  "mcpServers": {
+    "abcoder": {
+      "command": "abcoder",
+      "args": [
+        "mcp",
+        "{the-AST-directory}" // EX: "/abcoder-asts"
+      ]
+    }
+  }
+}
+```
+
+
+4. Enjoy it!
+
+See [using ABCoder in TRAE](https://bytedance.sg.larkoffice.com/file/SEmdbLpC1oCbclxmc5Dlkp9fg7r). Tips:
+
+- You can add more repo ASTs into the AST directory without restarting abcoder MCP server.
+
+- Try to use [the recommended prompt](llm/prompt/analyzer.md) and combine planning/memory tools like [sequential-thinking](https://github.com/modelcontextprotocol/servers/tree/main/src/sequentialthinking) in your AI agent.


 # Supported Languages
@@ -51,9 +81,8 @@ ABCoder currently supports the following languages:
 | -------- | ----------- | ----------- |
 | Go | ✅ | ✅ |
 | Rust | ✅ | Coming Soon |
-| C | Coming Soon ||
-| Python | Coming Soon ||
-
+| C | ✅ | ❌ |
+| Python | Coming Soon | Coming Soon |


 # Getting Involved

docs/parser-en.md

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
+# ABCoder - Language Parser Introduction
+
+ABCoder currently implements its parser on top of [LSP](https://microsoft.github.io/language-server-protocol/) (the Language Server Protocol) to collect dependencies precisely and to ease future multi-language extensions.
+
+## Code Structure
+
+Located under the [lang](/lang) package, including:
+
+- uniast: Go definitions of the unified AST structure
+- lsp: LSP client, providing interfaces for file parsing, reference lookup, syntax tree parsing, definition lookup, etc., as well as the **generic language specification LanguageSpec interface**
+- collect: Responsible for LSP symbol collection and UniAST export; this is the core computation logic
+- {language}: Mainly implements the lsp#Spec interface for the corresponding {language}. Also includes some language-specific logic for invoking LSP servers
+
+## Operation Process
+
+![lang-parser](../images/lang-parser.png)
+
+1. Identify the language through command line parameters to start the corresponding LSP server and pass initialization parameters
+2. Traverse repository files, call the `textDocument/documentSymbol` method to get all symbols for each file. For each symbol:
+   1. Call the `textDocument/semanticTokens/range` method to get tokens in the symbol code
+   2. Identify valid entity tokens, call `textDocument/definition` to jump to the corresponding symbol location, thus establishing node dependency relationships
+3. Repeat step 2 until file processing is complete. Finally convert the collected LSP symbols to UniAST format and output
+
+## Extending Other Language Implementations
+
+Since UniAST is not completely equivalent to LSP, some language-specific behavior interfaces need to be implemented for conversion. Taking the lang/rust package as a reference, the following capabilities generally need to be implemented:
+
+- GetDefaultLSP(): Map the user-supplied language to a specific lsp.Language and the corresponding LSP name
+- CheckRepo(): Check the user repository status, handle toolchain issues according to language specifications, and return the first file to open by default (to trigger the LSP server) and the waiting time for server initialization (determined by repository size)
+- **LanguageSpec interface**: Core module for handling generic syntax information that LSP does not provide, such as determining whether a token is a standard library symbol, function signature parsing, etc.
+- ModulePatcher: Post-processing module for language-specific information collection, for example Rust's `use` symbol collection (not collected by LSP). Optional; can be left unimplemented
+
+### LanguageSpec
+
+```go
+// Detailed implementation used to collect LSP symbols and transform them to UniAST
+type LanguageSpec interface {
+	// initialize a root workspace, and return all modules [modulename=>abs-path] inside
+	WorkSpace(root string) (map[string]string, error)
+
+	// given an absolute file path, return its module name and package path
+	// external paths should also be supported
+	// FIXME: some languages (like Rust) may have sub-mods inside a file, but we still consider it as one mod here
+	NameSpace(path string) (string, string, error)
+
+	// tells if a file belongs to the language AST
+	ShouldSkip(path string) bool
+
+	// FileImports parses the file content to get its imports
+	FileImports(content []byte) ([]uniast.Import, error)
+
+	// return the first declaration token of a symbol, as Type-Name
+	DeclareTokenOfSymbol(sym DocumentSymbol) int
+
+	// tells if a token is an AST entity
+	IsEntityToken(tok Token) bool
+
+	// tells if a token is a std token
+	IsStdToken(tok Token) bool
+
+	// return the SymbolKind of a token
+	TokenKind(tok Token) SymbolKind
+
+	// tells if a symbol is a main function
+	IsMainFunction(sym DocumentSymbol) bool
+
+	// tells if a symbol is a language symbol (func, type, variable, etc) in workspace
+	IsEntitySymbol(sym DocumentSymbol) bool
+
+	// tells if a symbol is public in workspace
+	IsPublicSymbol(sym DocumentSymbol) bool
+
+	// declares whether the language has impl symbols
+	// if it returns true, ImplSymbol() will be called
+	HasImplSymbol() bool
+	// if a symbol is an impl symbol, return the token index of interface type, receiver type and first-method start (-1 means not found)
+	// otherwise the collector will use FunctionSymbol() as receiver type token index (-1 means not found)
+	ImplSymbol(sym DocumentSymbol) (int, int, int)
+
+	// if a symbol is a Function or Method symbol, return the token index of Receiver (-1 means not found), TypeParameters, InputParameters and Outputs
+	FunctionSymbol(sym DocumentSymbol) (int, []int, []int, []int)
+}
+```
+
+- Rust-parser implementation location: [RustSpec](/lang/rust/spec.go)
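
As an illustration of what two of these methods might look like for a Go-like language, here is a minimal sketch. `toySpec` and the one-field `DocumentSymbol` are hypothetical stand-ins, not the real lsp package types. The sketch also assumes that `ShouldSkip` returning true means the file is excluded from the AST, which the interface comment does not state explicitly.

```go
package main

import (
	"fmt"
	"path/filepath"
	"unicode"
	"unicode/utf8"
)

// DocumentSymbol is a stand-in for the lsp package's type; only Name is modeled.
type DocumentSymbol struct {
	Name string
}

// toySpec is a hypothetical, partial spec for a Go-like language.
type toySpec struct{}

// ShouldSkip reports true for files that are not ".go" sources
// (assumption: true means "exclude this file from the AST").
func (toySpec) ShouldSkip(path string) bool {
	return filepath.Ext(path) != ".go"
}

// IsPublicSymbol applies Go's export rule: the name starts with an upper-case letter.
// An empty name decodes to utf8.RuneError, which is not upper-case.
func (toySpec) IsPublicSymbol(sym DocumentSymbol) bool {
	r, _ := utf8.DecodeRuneInString(sym.Name)
	return unicode.IsUpper(r)
}

func main() {
	s := toySpec{}
	fmt.Println(s.ShouldSkip("README.md"), s.IsPublicSymbol(DocumentSymbol{Name: "Parse"}))
}
```

A real spec would implement every method of the interface; the Rust implementation linked above is the reference to follow.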
+
+### ModulePatcher
+
+```go
+// ModulePatcher supplements some information for a module
+type ModulePatcher interface {
+	// Patch is called after collecting all symbols
+	Patch(ast *parse.Module)
+}
+```
+
+- Rust-parser implementation: [RustModulePatcher](/lang/rust/patch.go)
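
The post-processing idea can be sketched as follows. The fields of the real `parse.Module` are not shown in this diff, so the example uses a stand-in `Module` with an `Imports` slice. `usePatcher` and its `extra` field are hypothetical names for a patcher that merges in imports the LSP pass did not report.

```go
package main

import "fmt"

// Module is a stand-in for parse.Module; the real fields are not shown in this diff.
type Module struct {
	Name    string
	Imports []string
}

// ModulePatcher mirrors the interface above, using the stand-in type.
type ModulePatcher interface {
	Patch(ast *Module)
}

// usePatcher is a hypothetical patcher that appends imports
// discovered outside the LSP pass, skipping duplicates.
type usePatcher struct {
	extra []string
}

func (p usePatcher) Patch(ast *Module) {
	seen := make(map[string]bool, len(ast.Imports))
	for _, imp := range ast.Imports {
		seen[imp] = true
	}
	for _, imp := range p.extra {
		if !seen[imp] {
			ast.Imports = append(ast.Imports, imp)
			seen[imp] = true
		}
	}
}

func main() {
	m := &Module{Name: "demo", Imports: []string{"std::fmt"}}
	var p ModulePatcher = usePatcher{extra: []string{"std::fmt", "std::io"}}
	p.Patch(m)
	fmt.Println(m.Imports)
}
```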

docs/parser-zh.md

Lines changed: 3 additions & 0 deletions
@@ -50,6 +50,9 @@ type LanguageSpec interface {
 	// tells if a file belongs to the language AST
 	ShouldSkip(path string) bool

+	// FileImports parses the file content to get its imports
+	FileImports(content []byte) ([]uniast.Import, error)
+
 	// return the first declaration token of a symbol, as Type-Name
 	DeclareTokenOfSymbol(sym DocumentSymbol) int
