51 changes: 50 additions & 1 deletion adk/backend/local/README.md
@@ -16,7 +16,7 @@ go get github.com/cloudwego/eino-ext/adk/backend/local
import (
	"context"
	"github.com/cloudwego/eino-ext/adk/backend/local"
-	"github.com/cloudwego/eino/adk/middlewares/filesystem"
+	"github.com/cloudwego/eino/adk/filesystem"
)

// Initialize backend
@@ -42,6 +42,7 @@ content, err := backend.Read(ctx, &filesystem.ReadRequest{
- **Zero Configuration** - Works out of the box with no setup required
- **Direct Filesystem Access** - Operates on local files with native performance
- **Full Backend Implementation** - Supports all `filesystem.Backend` operations
- **Multimodal Read** - Returns structured image/PDF content for multimodal models; files that are neither images nor PDFs fall back to plain-text `Read`
- **Path Security** - Enforces absolute paths to prevent directory traversal
- **Safe Write** - Prevents accidental file overwrites by default

@@ -92,11 +93,59 @@ See the following examples for more usage:

### Additional Methods

- **`MultiModalRead(ctx, req)`** - Read a file as multimodal content (images / PDFs). Files that are neither images nor PDFs are delegated to `Read`.
- **`Execute(ctx, req)`** - Execute shell command (requires validation)
- **`ExecuteStreaming(ctx, req)`** - Execute with streaming output

**Note:** All paths must be absolute. Use `filepath.Abs()` to convert relative paths.

## MultiModalRead

`MultiModalRead` returns structured parts suitable for multimodal model input.

Supported file types:

- **Images** (`.jpg` / `.jpeg` / `.png` / `.gif` / `.bmp` / `.webp` / `.tiff` / `.tif`): returned as an `image` part whose MIME type is detected from the file's magic number.
- **PDF**:
  - Without `Pages`: the full PDF is returned as a `pdf` part.
  - With `Pages` (e.g. `"1"`, `"1-5"`): the specified page range is rendered to PNG (150 DPI) and returned as `image` parts.
- **Other files**: fall back to `Read`, returned via `MultiFileContent.FileContent`.
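
The three branches above can be pictured with a small sketch. This is not the backend's implementation: `fileKind`, `classify`, and `sniffMIME` are hypothetical names, dispatching PDFs on a `.pdf` extension is an assumption, and `http.DetectContentType` stands in for whatever MIME detection the backend actually performs.

```go
package main

import (
	"fmt"
	"net/http"
	"path/filepath"
	"strings"
)

// fileKind is a hypothetical label for the three branches described above.
type fileKind string

const (
	kindImage fileKind = "image"
	kindPDF   fileKind = "pdf"
	kindOther fileKind = "other"
)

// imageExts lists the image extensions named in the README.
var imageExts = map[string]bool{
	".jpg": true, ".jpeg": true, ".png": true, ".gif": true,
	".bmp": true, ".webp": true, ".tiff": true, ".tif": true,
}

// classify picks a branch from the file extension, case-insensitively.
// Treating ".pdf" as the PDF trigger is an assumption of this sketch.
func classify(path string) fileKind {
	ext := strings.ToLower(filepath.Ext(path))
	switch {
	case imageExts[ext]:
		return kindImage
	case ext == ".pdf":
		return kindPDF
	default:
		return kindOther
	}
}

// sniffMIME guesses a MIME type from the leading bytes of the content
// using the standard library's content sniffer.
func sniffMIME(data []byte) string {
	return http.DetectContentType(data)
}

func main() {
	for _, p := range []string{"/tmp/Scan.PNG", "/tmp/paper.pdf", "/tmp/notes.txt"} {
		fmt.Printf("%s -> %s\n", p, classify(p))
	}
	pngSignature := []byte{0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A}
	fmt.Println(sniffMIME(pngSignature))
}
```

The standard sniffer covers only a fixed set of signatures, so a real implementation may need its own checks for formats it does not recognize.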

Size and page limits:

| Scenario | Limit |
| ---------------- | ------------------ |
| Image | 10 MB |
| PDF (full read) | 20 MB |
| PDF (paged read) | 100 MB, 20 pages per request |

Files that exceed their limit are rejected up front, based on the size reported by `os.Stat`; the returned error message includes the actual size and the limit. When a full-read PDF is over the limit, the error message also suggests setting `Pages` to switch to paged reading.

### PDF Rendering Dependency

PDF page rendering is provided by [`go-fitz`](https://github.com/gen2brain/go-fitz), which uses MuPDF via `purego`/FFI (no classic CGO). The native library must be installed on the build/run machine:

- macOS: `brew install mupdf`
- Ubuntu / Debian: `apt-get install -y libmupdf-dev`
- CentOS / RHEL: `yum install -y mupdf-devel`

### Example

```go
res, err := backend.MultiModalRead(ctx, &filesystem.MultiModalReadRequest{
	ReadRequest: filesystem.ReadRequest{FilePath: "/path/to/page.pdf"},
	Pages:       "1-3",
})
if err != nil {
	// handle error
}
for _, part := range res.Parts {
	// part.Type: "image" | "pdf"
	// part.MIMEType: "image/png", "application/pdf", ...
	// part.Data: raw bytes
}
```
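
The `Pages` value in the example is a page or page-range string. A caller that wants to fail fast could validate it before sending the request; the sketch below is one way to do that, not the backend's parser. `parsePages` and `maxPagesPerRequest` are hypothetical names, and 1-based page numbering is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// maxPagesPerRequest mirrors the 20-page limit from the limits table.
const maxPagesPerRequest = 20

// parsePages is a hypothetical validator for strings such as "1" or "1-5".
// It assumes pages are 1-based and returns the inclusive first and last page.
func parsePages(spec string) (first, last int, err error) {
	startStr, endStr, isRange := strings.Cut(strings.TrimSpace(spec), "-")
	first, err = strconv.Atoi(strings.TrimSpace(startStr))
	if err != nil {
		return 0, 0, fmt.Errorf("invalid page %q: %w", startStr, err)
	}
	last = first
	if isRange {
		last, err = strconv.Atoi(strings.TrimSpace(endStr))
		if err != nil {
			return 0, 0, fmt.Errorf("invalid page %q: %w", endStr, err)
		}
	}
	if first < 1 || last < first {
		return 0, 0, fmt.Errorf("invalid page range %q", spec)
	}
	if n := last - first + 1; n > maxPagesPerRequest {
		return 0, 0, fmt.Errorf("range %q covers %d pages, limit is %d", spec, n, maxPagesPerRequest)
	}
	return first, last, nil
}

func main() {
	for _, spec := range []string{"1", "1-3", "1-25", "5-2"} {
		first, last, err := parsePages(spec)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("%q -> pages %d to %d\n", spec, first, last)
	}
}
```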

## Security

### Best Practices
51 changes: 50 additions & 1 deletion adk/backend/local/README_zh.md
@@ -16,7 +16,7 @@ go get github.com/cloudwego/eino-ext/adk/backend/local
import (
	"context"
	"github.com/cloudwego/eino-ext/adk/backend/local"
-	"github.com/cloudwego/eino/adk/middlewares/filesystem"
+	"github.com/cloudwego/eino/adk/filesystem"
)

// Initialize backend
@@ -42,6 +42,7 @@ content, err := backend.Read(ctx, &filesystem.ReadRequest{
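- **Zero Configuration** - Works out of the box, no setup needed
- **Direct Filesystem Access** - Operates on local files with native performance
- **Full Backend Implementation** - Supports all `filesystem.Backend` operations
- **Multimodal Read** - Returns structured image / PDF content for multimodal models; files that are neither images nor PDFs automatically fall back to plain-text `Read`
- **Path Security** - Enforces absolute paths to prevent directory traversal
- **Safe Write** - Prevents accidental file overwrites by default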

@@ -92,11 +93,59 @@ backend, _ := local.NewLocalBackend(ctx, &local.Config{

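### Additional Methods

- **`MultiModalRead(ctx, req)`** - Read a file as multimodal content (images / PDFs); files that are neither images nor PDFs fall back to `Read`.
- **`Execute(ctx, req)`** - Execute a shell command (requires validation)
- **`ExecuteStreaming(ctx, req)`** - Execute with streaming output

**Note:** All paths must be absolute. Use `filepath.Abs()` to convert relative paths.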

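## MultiModalRead

`MultiModalRead` returns file content as structured Parts suited to multimodal model input.

Supported file types:

- **Images** (`.jpg` / `.jpeg` / `.png` / `.gif` / `.bmp` / `.webp` / `.tiff` / `.tif`): returned as an `image` Part; the MIME type is detected from the magic number.
- **PDF**:
  - Without `Pages`: the whole PDF is returned unchanged as a `pdf` Part.
  - With `Pages` (for example `"1"` or `"1-5"`): the requested range is rendered to PNG (150 DPI) and returned as `image` Parts.
- **Other files**: fall back to `Read`; the content is returned via `MultiFileContent.FileContent`.

Size and page limits:

| Scenario         | Limit                                |
| ---------------- | ------------------------------------ |
| Image            | 10 MB                                |
| PDF (full read)  | 20 MB                                |
| PDF (paged read) | 100 MB, at most 20 pages per request |

Files over the limit are rejected up front via `os.Stat`; the returned error message includes the actual size and the limit. When a full PDF read exceeds the limit, the error message suggests using `Pages` to switch to paged reading.

### PDF Rendering Dependency

Paged PDF rendering depends on [`go-fitz`](https://github.com/gen2brain/go-fitz), which calls MuPDF through `purego`/FFI (not traditional CGO). The native library must be installed on the build/run machine:

- macOS: `brew install mupdf`
- Ubuntu / Debian: `apt-get install -y libmupdf-dev`
- CentOS / RHEL: `yum install -y mupdf-devel`

### Usage Example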

```go
res, err := backend.MultiModalRead(ctx, &filesystem.MultiModalReadRequest{
	ReadRequest: filesystem.ReadRequest{FilePath: "/path/to/page.pdf"},
	Pages:       "1-3",
})
if err != nil {
	// handle error
}
for _, part := range res.Parts {
	// part.Type: "image" | "pdf"
	// part.MIMEType: "image/png", "application/pdf", etc.
	// part.Data: raw bytes
}
```

## Security

### Best Practices
47 changes: 46 additions & 1 deletion adk/backend/local/examples/backend/main.go
@@ -18,12 +18,13 @@ package main

import (
	"context"
+	"encoding/base64"
	"fmt"
	"log"
	"os"
	"path/filepath"

-	"github.com/cloudwego/eino/adk/middlewares/filesystem"
+	"github.com/cloudwego/eino/adk/filesystem"

	"github.com/cloudwego/eino-ext/adk/backend/local"
)
@@ -222,6 +223,50 @@ func main() {
	}
	fmt.Println()

	// ========================================
	// Example 8: MultiModalRead
	// ========================================
	fmt.Println("Example 8: MultiModalRead (images/PDFs + text fallback)")
	fmt.Println("---------------------------------------------------------")

	// 8a) Text fallback: .txt is not an image/PDF, so it falls back to Read.
	textRes, err := backend.MultiModalRead(ctx, &filesystem.MultiModalReadRequest{
		ReadRequest: filesystem.ReadRequest{FilePath: filePath},
	})
	if err != nil {
		log.Fatalf("✗ Failed MultiModalRead on text: %v", err)
	}
	fmt.Println("Text file (fallback to Read):")
	fmt.Println("─────────────────────────")
	if textRes.FileContent != nil {
		fmt.Print(textRes.FileContent.Content)
	}
	fmt.Println("\n─────────────────────────")

	// 8b) Image branch: write a 1x1 PNG and read it as multimodal content.
	// 8-byte PNG signature + IHDR/IDAT/IEND encoding a 1x1 transparent pixel.
	const onePixelPNGBase64 = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNgAAIAAAUAAeImBZsAAAAASUVORK5CYII="
	pngBytes, err := base64.StdEncoding.DecodeString(onePixelPNGBase64)
	if err != nil {
		log.Fatalf("✗ Failed to decode PNG sample: %v", err)
	}
	pngPath := filepath.Join(tempDir, "pixel.png")
	if err := os.WriteFile(pngPath, pngBytes, 0644); err != nil {
		log.Fatalf("✗ Failed to write PNG sample: %v", err)
	}

	imgRes, err := backend.MultiModalRead(ctx, &filesystem.MultiModalReadRequest{
		ReadRequest: filesystem.ReadRequest{FilePath: pngPath},
	})
	if err != nil {
		log.Fatalf("✗ Failed MultiModalRead on png: %v", err)
	}
	fmt.Println("PNG image (multimodal):")
	for i, part := range imgRes.Parts {
		fmt.Printf("  part %d: type=%s mime=%s bytes=%d\n", i+1, part.Type, part.MIMEType, len(part.Data))
	}
	fmt.Println()

	fmt.Println("========================================")
	fmt.Println("✓ All examples completed successfully!")
	fmt.Println("========================================")
9 changes: 6 additions & 3 deletions adk/backend/local/go.mod
@@ -1,10 +1,11 @@
module github.com/cloudwego/eino-ext/adk/backend/local

-go 1.18
+go 1.23.0

require (
	github.com/bmatcuk/doublestar/v4 v4.10.0
-	github.com/cloudwego/eino v0.8.0
+	github.com/cloudwego/eino v0.9.0-alpha.5.0.20260421122314-28a9142a0774
+	github.com/gen2brain/go-fitz v1.24.15
	github.com/stretchr/testify v1.11.1
)

@@ -17,10 +18,12 @@ require (
github.com/cloudwego/base64x v0.1.6 // indirect
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/dustin/go-humanize v1.0.1 // indirect
github.com/ebitengine/purego v0.8.4 // indirect
github.com/eino-contrib/jsonschema v1.0.3 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/goph/emperror v0.17.2 // indirect
github.com/json-iterator/go v1.1.12 // indirect
github.com/jupiterrider/ffi v0.5.0 // indirect
github.com/klauspost/cpuid/v2 v2.2.9 // indirect
github.com/kr/pretty v0.2.0 // indirect
github.com/mailru/easyjson v0.7.7 // indirect
@@ -38,7 +41,7 @@ require (
golang.org/x/arch v0.11.0 // indirect
golang.org/x/crypto v0.32.0 // indirect
golang.org/x/exp v0.0.0-20230713183714-613f0c0eb8a1 // indirect
-	golang.org/x/sys v0.29.0 // indirect
+	golang.org/x/sys v0.33.0 // indirect
gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
)