157 changes: 157 additions & 0 deletions .github/workflows/translate.yml
@@ -0,0 +1,157 @@
name: Auto Translate Docs

on:
  workflow_run:
    workflows: ["Process Documentation"]
    types:
      - completed
    branches-ignore:
      - 'main'
  push:
    branches-ignore:
      - 'main'
    paths-ignore:
      - '.github/workflows/**'

jobs:
  translate:
    runs-on: ubuntu-latest
    # Only run if the workflow_run event was successful, or if it's a direct push
    if: github.event_name == 'push' || github.event.workflow_run.conclusion == 'success'
    permissions:
      contents: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Fetches all history for git diff
          token: ${{ secrets.GITHUB_TOKEN }}
          # For workflow_run events, checkout the head of the triggering workflow
          ref: ${{ github.event_name == 'workflow_run' && github.event.workflow_run.head_sha || github.sha }}

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'

      - name: Install dependencies
        run: pip install httpx aiofiles python-dotenv

      - name: Get changed markdown files
        id: changed-files
        run: |
          # Get the list of newly added files between the current and previous commit
          # We filter for .md and .mdx files that are inside the language directories
          # Only include added (A) files, skip modified (M) and deleted (D) files

          # Determine the commit SHA to use based on event type
          if [[ "${{ github.event_name }}" == "workflow_run" ]]; then
            current_sha="${{ github.event.workflow_run.head_sha }}"
            echo "Using workflow_run head_sha: $current_sha"
          else
            current_sha="${{ github.sha }}"
            echo "Using github.sha: $current_sha"
          fi

          # Try different approaches to get the diff
          if [[ -n "${{ github.event.before }}" && "${{ github.event_name }}" == "push" ]]; then
            echo "Using github.event.before: ${{ github.event.before }}"
            files=$(git diff --name-status ${{ github.event.before }} $current_sha | grep -E '^A\s+' | cut -f2 | grep -E '^(en|en-us|zh-hans|ja-jp|plugin-dev-en|plugin-dev-zh|plugin-dev-ja|versions)/.*(\.md|\.mdx)$' || true)
          else
            echo "Using HEAD~1 for comparison"
            files=$(git diff --name-status HEAD~1 $current_sha | grep -E '^A\s+' | cut -f2 | grep -E '^(en|en-us|zh-hans|ja-jp|plugin-dev-en|plugin-dev-zh|plugin-dev-ja|versions)/.*(\.md|\.mdx)$' || true)
          fi

          echo "Detected files (Added only):"
          echo "$files"

          # Filter out files that don't actually exist
          existing_files=""
          if [[ -n "$files" ]]; then
            while IFS= read -r file; do
              if [[ -n "$file" && -f "$file" ]]; then
                if [[ -z "$existing_files" ]]; then
                  existing_files="$file"
                else
                  existing_files="$existing_files"$'\n'"$file"
                fi
              else
                echo "Skipping non-existent file: $file"
              fi
            done <<< "$files"
          fi

          echo "Final files to translate:"
          echo "$existing_files"

          if [[ -z "$existing_files" ]]; then
            echo "No new markdown files to translate."
            echo "files=" >> $GITHUB_OUTPUT
          else
            # The script expects absolute paths, but we run it from the root, so relative is fine.
            echo "files<<EOF" >> $GITHUB_OUTPUT
            echo "$existing_files" >> $GITHUB_OUTPUT
            echo "EOF" >> $GITHUB_OUTPUT
          fi

      - name: Run translation script
        if: steps.changed-files.outputs.files
        env:
          DIFY_API_KEY: ${{ secrets.DIFY_API_KEY }}
        run: |
          echo "Files to translate:"
          echo "${{ steps.changed-files.outputs.files }}"

          # Create temporary file list
          echo "${{ steps.changed-files.outputs.files }}" > /tmp/files_to_translate.txt

          # Start all translation processes in parallel
          pids=()
          while IFS= read -r file; do
            if [[ -n "$file" ]]; then
              echo "Starting translation for $file..."
              python tools/translate/main.py "$file" "$DIFY_API_KEY" &
              pids+=($!)
            fi
          done < /tmp/files_to_translate.txt

          # Wait for all background processes to complete
          echo "Waiting for ${#pids[@]} translation processes to complete..."
          failed=0
          for pid in "${pids[@]}"; do
            if ! wait "$pid"; then
              echo "Translation process $pid failed"
              failed=1
            fi
          done

          if [ $failed -eq 1 ]; then
            echo "Some translations failed"
            exit 1
          fi

          echo "All translations completed successfully"

      - name: Commit and push changes
        run: |
          git config --global user.name 'github-actions[bot]'
          git config --global user.email 'github-actions[bot]@users.noreply.github.com'
          # Check if there are any changes to commit
          if [[ -n $(git status --porcelain) ]]; then
            git add .
            git commit -m "docs: auto-translate documentation"
            # Push to the appropriate branch based on event type
            if [[ "${{ github.event_name }}" == "workflow_run" ]]; then
              # For workflow_run events, push to the head branch of the triggering workflow
              branch_ref="${{ github.event.workflow_run.head_branch }}"
              echo "Pushing to workflow_run head branch: $branch_ref"
              git push origin HEAD:$branch_ref
            else
              # For push events, push to the same branch the workflow was triggered from
              echo "Pushing to current branch: ${{ github.ref_name }}"
              git push origin HEAD:${{ github.ref_name }}
            fi
            echo "Translated files have been pushed to the branch."
          else
            echo "No new translations to commit."
          fi
1 change: 1 addition & 0 deletions tools/translate/.env.example
@@ -0,0 +1 @@
dify_api_key=your_dify_api_key_here
1 change: 1 addition & 0 deletions tools/translate/.gitignore
@@ -0,0 +1 @@
.env
125 changes: 125 additions & 0 deletions tools/translate/README.md
@@ -0,0 +1,125 @@
# Automatic Document Translation

A multi-language documentation auto-translation system built on GitHub Actions and Dify AI. It translates between English, Chinese, and Japanese.

> **Other Languages**: [中文](README.md) | [日本語](README_JA.md)

## How It Works

1. **Trigger Condition**: Runs automatically on pushes to non-main branches, and after the `Process Documentation` workflow completes successfully on a non-main branch
2. **Smart Detection**: Identifies newly added `.md`/`.mdx` files in the language directories and determines the source language
3. **Translation Logic**:
   - ✅ Translates new documents into the other languages
   - ❌ Skips translation files that already exist (avoids overwriting manual edits)
4. **Auto Commit**: Pushes the translation results to the current branch automatically
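
The detection step can be sketched in Python as follows. This is a minimal illustration that mirrors the filter in `.github/workflows/translate.yml` (added files only, matched against the language directories); the function name `added_doc_files` is hypothetical and is not part of `tools/translate/main.py`.

```python
import re

# Same directory and extension filter as the grep pattern in the workflow.
DOC_PATTERN = re.compile(
    r"^(en|en-us|zh-hans|ja-jp|plugin-dev-en|plugin-dev-zh|plugin-dev-ja|versions)"
    r"/.*(\.md|\.mdx)$"
)


def added_doc_files(name_status_output: str) -> list[str]:
    """Return added (status 'A') docs from `git diff --name-status` output.

    Hypothetical helper for illustration only.
    """
    files = []
    for line in name_status_output.splitlines():
        parts = line.split("\t")  # git separates status and path with a tab
        if len(parts) == 2 and parts[0] == "A" and DOC_PATTERN.match(parts[1]):
            files.append(parts[1])
    return files
```

Modified (`M`), deleted (`D`), and renamed entries are dropped, which is why editing an existing document does not trigger a translation.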

## System Features

- 🌐 **Multi-language Support**: Language mapping is configuration-based, so more languages can be added in principle
- 📚 **Terminology Consistency**: A built-in terminology database guides the LLM toward consistent translations of technical vocabulary
- 🔄 **Concurrent Processing**: Translates into multiple target languages simultaneously under a concurrency limit
- 🛡️ **Fault Tolerance**: Retries up to 3 times with exponential backoff
- ⚡ **Incremental Translation**: Processes only changed files, avoiding redundant work
- 🧠 **High-Performance Models**: Relies on a high-performance LLM for translation quality
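
As a rough sketch of the fault-tolerance behavior, retrying with exponential backoff could look like the code below. The helper name `with_retries`, the one-second base delay, and the reading of "3 retries" as three attempts in total are all assumptions; the actual script may differ.

```python
import asyncio


async def with_retries(call, attempts=3, base_delay=1.0):
    """Await call() up to `attempts` times, doubling the wait after each failure.

    Hypothetical helper: `attempts` and `base_delay` are illustrative defaults.
    """
    for attempt in range(attempts):
        try:
            return await call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay * 2 ** attempt)
```

With the defaults, a failing call waits 1 second before the second attempt and 2 seconds before the third.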

## Usage

### For Document Writers

1. Write a new document in any supported language directory (edits to existing documents are not retranslated)
2. Push to a non-main branch
3. Wait 0.5-1 minute for the automatic translation to complete
4. **View Translation Results**:
   - Create a Pull Request for local viewing and further editing
   - Or open the commit pushed by Actions on GitHub to review the translation quality directly

### Supported Language Directories

- **General Documentation**: `en/` ↔ `zh-hans/` ↔ `ja-jp/`
- **Plugin Development Documentation**: `plugin-dev-en/` ↔ `plugin-dev-zh/` ↔ `plugin-dev-ja/`

Note: The architecture supports additional languages; adding one only requires modifying the configuration files.
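
To illustrate how a source document maps to its translation targets, here is a minimal sketch. The `LANGUAGE_GROUPS` structure and both function names are hypothetical; the real configuration files may be organized differently.

```python
from pathlib import Path, PurePosixPath

# Hypothetical mapping based on the directory groups listed above.
LANGUAGE_GROUPS = [
    ("en", "zh-hans", "ja-jp"),
    ("plugin-dev-en", "plugin-dev-zh", "plugin-dev-ja"),
]


def target_paths(source: str) -> list[str]:
    """Return the sibling-language paths for a source document."""
    parts = PurePosixPath(source).parts
    for group in LANGUAGE_GROUPS:
        if parts and parts[0] in group:
            return [
                str(PurePosixPath(lang, *parts[1:]))
                for lang in group
                if lang != parts[0]
            ]
    return []  # not inside a known language directory


def missing_targets(source: str, root: str = ".") -> list[str]:
    """Keep only targets that do not exist yet, so manual edits are never overwritten."""
    return [t for t in target_paths(source) if not (Path(root) / t).exists()]
```

For example, `target_paths("en/guides/intro.mdx")` yields the `zh-hans/` and `ja-jp/` counterparts, and `missing_targets` drops any that are already on disk.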

## Important Notes

- The system translates only new documents and does not overwrite existing translations
- To update an existing translation, manually delete the target file, then retrigger the workflow
- Terminology follows the vocabulary in `termbase_i18n.md`, which the LLM recognizes and applies during translation
- Translation quality depends on the configured model; a high-performance base model in Dify Studio is recommended

### System Configuration

#### Terminology Database

Edit `tools/translate/termbase_i18n.md` to update the terminology translation reference table.

#### Translation Model

Visit Dify Studio to adjust the translation prompts or change the base model.

---

## 🔧 Development and Deployment Configuration

### Local Development Environment

#### 1. Create Virtual Environment

```bash
# Create virtual environment
python -m venv venv

# Activate virtual environment
# macOS/Linux:
source venv/bin/activate
# Windows:
# venv\Scripts\activate
```

#### 2. Install Dependencies

```bash
pip install -r tools/translate/requirements.txt
```

#### 3. Configure API Key

Create a `.env` file in the `tools/translate/` directory:

```bash
DIFY_API_KEY=your_dify_api_key_here
```

#### 4. Run Translation

```bash
# Interactive mode (recommended for beginners)
python tools/translate/main.py

# Command line mode (specify file)
python tools/translate/main.py path/to/file.mdx [DIFY_API_KEY]
```

> **Tip**: In your IDE, right-click the file and select "Copy Relative Path" to get the path argument

### Deploy to Other Repositories

1. **Copy Files**:
   - `.github/workflows/translate.yml`
   - The entire `tools/translate/` directory

2. **Configure GitHub Secrets**:
   - Go to Repository Settings → Secrets and variables → Actions
   - Add a `DIFY_API_KEY` secret

3. **Test**: Add a document on a non-main branch to verify that automatic translation works

### Technical Details

- Concurrent translation is limited to 2 tasks to avoid excessive API pressure
- Supports the `.md` and `.mdx` file formats
- Built on the Dify API workflow mode
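
One common way to cap concurrency at 2 tasks is an `asyncio.Semaphore`. The sketch below assumes that approach; `translate_all` and its `translate` callback are hypothetical names, and the actual script may implement the limit differently.

```python
import asyncio

MAX_CONCURRENT = 2  # matches the limit stated above


async def translate_all(jobs, translate):
    """Run translate(job) for every job, with at most MAX_CONCURRENT in flight.

    Hypothetical helper: `translate` is any coroutine function taking one job.
    """
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)

    async def run_one(job):
        async with semaphore:  # blocks while two translations are already running
            return await translate(job)

    # gather preserves input order in its results
    return await asyncio.gather(*(run_one(job) for job in jobs))
```

Results come back in the same order as `jobs`, regardless of which translation finishes first.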

## TODO

- [ ] Support updating existing translations