Commit cbc1a95

earayu and claude committed
docs(indexing): perf audit v2 — K8s prod / pgbouncer / admin UI
Per earayu2 thread directives:
- msg=caf5c760 / msg=4e9c909c: K8s is the production target; docker-compose is for e2e only
- msg=e6e4d366 / msg=2f9b062f: PG runs on KubeBlocks + pgbouncer (transaction pooling)
- msg=99c1d23a: consider exposing the new tunable parameters in the admin UI

New sections:
- §9 K8s prod deployment parameters: resources requests/limits 3-tier table / HPA + KEDA queue depth triggers / leader-election boundary (P1-Helm-3 Redis SETNX lease) / PVC configuration / OBJECT_STORE multi-replica enforcement / PodDisruptionBudget / monitoring and alerting (process_resident_memory / queue depth / pg_stat_activity / vector store latency p99)
- §10 PG + KubeBlocks + pgbouncer: pooling mode compatibility audit checklist (prepared statements / SET LOCAL / advisory lock all ✅) / recommended pgbouncer.ini parameters (pool_mode=transaction, max_client_conn=500, default_pool_size=25) / matching KubeBlocks PG values / ApeRAG-side changes (pool 30 + pgbouncer 25 server / 4 replica = 120 client) / Helm template P1-Helm-6 / verification procedure
- §11 admin UI configurability list: class A runtime perf (14 items strongly recommended for the IndexingSettings card) / class B collection-level (5 lane on/off + graph extractor concurrency) / class C infra ops (db pool / pgbouncer / Helm resources: deploy-time parameters, kept out of the admin UI) / P2-Admin-1 IndexingSettings card wireframe / P2-Admin-2 backend changes (env > DB settings priority) / hook for @dongdong frontend integration

§12-§16 to be added by @ziang: read path / cleanup / end-to-end attribution / KubeBlocks research / joint acceptance

main HEAD pin: eb4c4f3 (2026-04-30 18:46)
PR: #1954
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 92b9449 commit cbc1a95

1 file changed

Lines changed: 313 additions & 7 deletions

docs/zh-CN/architecture/indexing-perf-audit-v1.md

@@ -394,17 +394,323 @@ podDisruptionBudget:
394394

395395
---
396396

397-
## 10. Sections to be added by @ziang
397+
## 9. K8s production deployment parameters (per earayu2 msg=caf5c760 / msg=4e9c909c: K8s is prod, docker-compose is e2e only)
398398

399-
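- §11 Read path + GraphVectors/chunks.jsonl reuse boundary (duplicate IO / duplicate derivation / cache key invalidation)
400-
- §12 cleanup / deletion / reconciler (batch-delete SQL / failure retry / stale reclaim)
401-
- §13 End-to-end bottleneck ranking for long documents / large document sets (layered attribution across parser / artifact / DB / queue / LLM / graph store)
402-
- §14 Joint acceptance checklist (merging the §6 implementation slices with ziang's additions)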
399+
### 9.1 Confirming the ApeRAG deployment form

| Deployment target | Purpose | Current default |
|---|---|---|
| **Kubernetes (KubeBlocks for DB)** | **Production** | Helm chart `deploy/aperag/` |
| docker-compose | e2e / local development / SOHO single machine | `docker-compose.yml` |

All perf-relevant parameters for K8s prod live in `deploy/aperag/values.yaml`. This section lists, layer by layer, the defaults and the tier 1/2/3 recommendations.
407+
408+
### 9.2 Resource requests/limits

| Component | Current | Tier 2 (medium, 1K-10K docs) | Tier 3 (large, 10K+ docs / long documents) |
|---|---|---|---|
| `api` | `resources: {}` (BestEffort QoS) | `requests: cpu=500m mem=1Gi` `limits: cpu=2 mem=4Gi` | `requests: cpu=1 mem=2Gi` `limits: cpu=4 mem=8Gi` |
| `indexingWorker` | `resources: {}` | `requests: cpu=2 mem=4Gi` `limits: cpu=4 mem=8Gi` | `requests: cpu=4 mem=8Gi` `limits: cpu=8 mem=16Gi` |
| `frontend` | `resources: {}` | `requests: cpu=100m mem=256Mi` `limits: cpu=500m mem=512Mi` | Same as Tier 2 |

**P0-Helm-1**: add default `resources.requests` to values.yaml (Tier 2 sane defaults). Rationale:
- `resources: {}` gives the pod BestEffort QoS in K8s: it is the first to be killed when the node runs out of memory and it has no CPU reserved, which is fatal in production.
- The defaults deliberately set only requests, not limits. Hard limits risk CPU throttling that shows up as unexplained latency, while requests alone are enough for the scheduler to place pods on suitably sized nodes.
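
To make the QoS argument concrete, here is a minimal sketch of how Kubernetes derives a pod's QoS class from its containers' requests and limits. The function name `qos_class` and the dictionary shape are illustrative assumptions, and quantities are compared as plain strings, which a real implementation would need to normalize (`"1"` vs `"1000m"`).

```python
from typing import List, Mapping

# One container's `resources` stanza, e.g. {"requests": {"cpu": "2"}, "limits": {...}}.
Resources = Mapping[str, Mapping[str, str]]


def qos_class(containers: List[Resources]) -> str:
    """Return the Kubernetes QoS class for a pod (simplified sketch)."""
    any_set = False
    guaranteed = True
    for container in containers:
        requests = dict(container.get("requests", {}))
        limits = dict(container.get("limits", {}))
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # Kubernetes defaults a missing request to the limit.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"  # what `resources: {}` produces today
    return "Guaranteed" if guaranteed else "Burstable"


if __name__ == "__main__":
    print(qos_class([{}]))
    print(qos_class([{"requests": {"cpu": "2", "memory": "4Gi"}}]))
```

With requests only, as P0-Helm-1 proposes, the pods move from BestEffort to Burstable.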
419+
420+
### 9.3 Replica count + HPA

**Current**: `api.replicaCount=1`, `indexingWorker.replicaCount=1`, no HPA.

**P1-Helm-2** HPA template (scaling on `q:parse` + `q:indexing:*` queue depth via KEDA):
425+
426+
```yaml
# New in values.yaml
hpa:
  api:
    enabled: false  # off by default; the operator decides whether to turn it on
    minReplicas: 2
    maxReplicas: 10
    targetCPUUtilizationPercentage: 70
  indexingWorker:
    enabled: false
    minReplicas: 2
    maxReplicas: 8
    # KEDA-based scaling on Redis queue depth
    keda:
      enabled: false
      triggers:
        - type: redis
          metadata:
            address: "{redis-host}:6379"
            listName: "q:parse"
            listLength: "10"  # scale up if >10 items pending
        - type: redis
          metadata:
            address: "{redis-host}:6379"  # assumed: same Redis instance as the q:parse trigger
            listName: "q:indexing:vector"
            listLength: "100"
```
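
As a rough illustration of what these triggers mean for replica counts: the HPA that KEDA drives targets an average list length per replica, so the desired count is approximately the queue depth divided by `listLength`, clamped to the min/max bounds. The sketch below is a simplification (the real HPA also applies a tolerance band and stabilization windows, and with several triggers the largest result wins); `desired_replicas` is a hypothetical helper name.

```python
import math


def desired_replicas(queue_length: int, list_length_target: int,
                     min_replicas: int, max_replicas: int) -> int:
    """Approximate replica count for one Redis list trigger."""
    if list_length_target <= 0:
        raise ValueError("list_length_target must be positive")
    # Average-value target: each replica should see at most `list_length_target` items.
    wanted = math.ceil(queue_length / list_length_target)
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # q:parse with listLength=10 and the indexingWorker bounds 2..8 from the YAML above
    for depth in (5, 35, 1000):
        print(depth, desired_replicas(depth, 10, 2, 8))
```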
452+
453+
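**Leader-election boundary** (must be settled before running multiple replicas):
- If `run_reconcile_loop` runs in every pod, each pod repeats the same table scans (every 30s: PENDING + FAILED + RUNNING + stuck + collection_regen + graph_vectors_enqueue).
- If `run_cleanup_loop` runs in every pod, each pod repeats the orphan scan.
- Short term: with `indexingWorker.replicaCount=1` there is no problem. Before adding replicas, leader-election (Redis Lua SETNX + lease) must be in place; otherwise the reconciler pushes duplicates (idempotent, but it wastes Redis IOPS and DB scan cost).

**P1-Helm-3** Simple leader-election implementation:
- On startup each pod runs `SET indexing:leader:<lane> <pod_name> NX EX 60` (plain `SETNX` takes no `EX` argument, so the `SET ... NX EX` form is needed to set value and expiry atomically); the pod that wins runs reconciler / cleanup.
- Renew the lease every 20s (`SET ... XX EX 60`, and only while the key still holds this pod's name); on losing the lease, stop the reconciler / cleanup loops immediately.
- Worker lanes (vector/fulltext/graph_*/summary/vision/parse/graph_curation) are safe with multiple replicas (Redis BLPOP delivers each item to exactly one consumer) and need no leader-election.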
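
The acquire/renew/lose cycle described above can be sketched as follows. To keep the example runnable without a Redis server, `FakeLeaseStore` is an in-memory stand-in (an assumption for illustration only) for `SET key value NX EX ttl` and for an owner-checked renew; against real Redis the owner check and the expiry refresh would have to be one atomic step, for example a Lua script. The class names and the lane name `reconciler` are hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class FakeLeaseStore:
    """In-memory stand-in for Redis lease operations (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}

    def _live_owner(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None or entry[1] <= self._clock():
            self._data.pop(key, None)  # expired or absent
            return None
        return entry[0]

    def acquire(self, key: str, owner: str, ttl: float) -> bool:
        # Mirrors `SET key owner NX EX ttl`: succeeds only if no live holder exists.
        if self._live_owner(key) is not None:
            return False
        self._data[key] = (owner, self._clock() + ttl)
        return True

    def renew(self, key: str, owner: str, ttl: float) -> bool:
        # Extends the lease only if `owner` still holds it.
        if self._live_owner(key) != owner:
            return False
        self._data[key] = (owner, self._clock() + ttl)
        return True


class LeaderLease:
    """Call tick() every ~20s; run reconciler/cleanup only while it returns True."""

    def __init__(self, store: FakeLeaseStore, lane: str, pod_name: str, ttl: float = 60.0) -> None:
        self.store = store
        self.key = f"indexing:leader:{lane}"
        self.pod_name = pod_name
        self.ttl = ttl
        self.is_leader = False

    def tick(self) -> bool:
        if self.is_leader:
            self.is_leader = self.store.renew(self.key, self.pod_name, self.ttl)
        else:
            self.is_leader = self.store.acquire(self.key, self.pod_name, self.ttl)
        return self.is_leader
```

A pod that sees `tick()` return `False` after having been leader should stop its reconcile and cleanup loops before the next iteration.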
462+
463+
### 9.4 PVC / persistence

| Volume | Purpose | Tier 2 | Tier 3 |
|---|---|---|---|
| `api-data` (`/data/aperag`) | API temporary cache | 10 GiB | 50 GiB |
| `indexingWorker-objects` (`/data/objects`) | parser-derived artifacts + user-uploaded sources | 100 GiB | 1 TiB (or switch to S3/MinIO) |
| postgres-data (managed by KubeBlocks) | main DB + DocumentIndex + pgvector | 50 GiB | 200 GiB |
| qdrant-data (managed by KubeBlocks) | vector store | 50 GiB | 500 GiB |
| es-data (managed by KubeBlocks) | full-text index | 30 GiB | 200 GiB |
| redis-data (managed by KubeBlocks) | queue + cache + quota | 5 GiB | 20 GiB |

**P1-Helm-4**: `OBJECT_STORE_TYPE=local` cannot be used in a multi-replica deployment (each pod has its own disk and cannot see artifacts written by the others). Tier 2+ must switch to `OBJECT_STORE_TYPE=s3` (including MinIO). Add enforcement to values.yaml: `indexingWorker.replicaCount > 1 && OBJECT_STORE_TYPE == "local"` → fail with a Helm template error.
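
The enforcement rule itself is a one-line predicate. The document places it in the Helm template; the sketch below expresses the same check in Python purely for illustration (for instance as a pre-deploy lint). `check_object_store` is a hypothetical name.

```python
def check_object_store(replica_count: int, object_store_type: str) -> None:
    """Reject local object storage when more than one indexing worker runs."""
    if replica_count > 1 and object_store_type.strip().lower() == "local":
        raise ValueError(
            "OBJECT_STORE_TYPE=local is not usable with "
            f"indexingWorker.replicaCount={replica_count}; use s3 (MinIO included)"
        )


if __name__ == "__main__":
    check_object_store(1, "local")  # single replica: allowed
    check_object_store(4, "s3")     # shared object store: allowed
```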
475+
476+
### 9.5 PodDisruptionBudget + rolling upgrades

**P2-Helm-5**:
```yaml
podDisruptionBudget:
  api:
    enabled: true
    minAvailable: 1
  indexingWorker:
    enabled: true
    minAvailable: 1
```

`indexing-worker-deployment.yaml` already uses `livenessProbe.exec: pgrep -f aperag.cli.indexing_worker` plus a 25s graceful drain (per task #17 cuiwenbo msg=f7868d2c), so a rolling update does not lose in-flight tasks.
490+
491+
### 9.6 Monitoring + alerting (required for K8s prod)

- `process_resident_memory_bytes` per pod; alert when p95 exceeds 80% of requests
- `q:parse` / `q:indexing:*` queue depth (KEDA / Prometheus)
- `document_index` table size + dead tuple ratio
- `pg_stat_activity`: worker DB connection usage vs pool budget
- pgvector / Qdrant write latency p99
- `derive` + `sync` duration p50/p95/p99 for each modality (OTLP is already wired in, but the default emitter is noop; prod must switch to otlp)
499+
500+
---
501+
502+
## 10. PG + KubeBlocks + pgbouncer (per earayu2 msg=e6e4d366 / msg=2f9b062f)

### 10.1 Current state

- `deploy/databases/postgresql/values.yaml` deploys the PG cluster through KubeBlocks (`pg-cluster-postgresql-postgresql` SVC)
- No pgbouncer (ApeRAG connects to PG directly)
- API pod `dbPoolSize=5/dbMaxOverflow=5`, indexingWorker `dbPoolSize=10/dbMaxOverflow=10`
- The pool budget is worked out by hand (values.yaml L324-333): `sum(replicas * (pool+overflow)) + surge + reserved < max_connections * 0.7`
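
The hand calculation can be written down as a small check. This is a sketch of the formula quoted above; the `Component` class, the `fits_budget` name, and the `surge` / `reserved` numbers in the usage lines are illustrative assumptions, not values taken from values.yaml.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Component:
    replicas: int
    pool_size: int
    max_overflow: int

    @property
    def max_connections(self) -> int:
        # Worst case: every replica fills its pool and its overflow.
        return self.replicas * (self.pool_size + self.max_overflow)


def fits_budget(components: Iterable[Component], surge: int, reserved: int,
                pg_max_connections: int, headroom: float = 0.7) -> bool:
    """sum(replicas * (pool + overflow)) + surge + reserved < max_connections * 0.7"""
    demand = sum(c.max_connections for c in components) + surge + reserved
    return demand < pg_max_connections * headroom


if __name__ == "__main__":
    api = Component(replicas=1, pool_size=5, max_overflow=5)
    worker = Component(replicas=1, pool_size=10, max_overflow=10)
    # surge=10 and reserved=10 are placeholder numbers for illustration.
    print(fits_budget([api, worker], surge=10, reserved=10, pg_max_connections=100))
```

Scaling the worker to 4 replicas in the same call pushes demand to 110 against a budget of 70, which is the ceiling that motivates pgbouncer in the next subsection.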
510+
511+
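### 10.2 Benefits of introducing pgbouncer

1. **Replica scaling is no longer capped by max_connections**: pgbouncer multiplexes `client_conn` onto a fixed `pool_size` of server connections, so with PG max_connections at 100 the ApeRAG side can hold 200-500 clients.
2. **More stable under cold start / congestion**: several replicas opening connections at once during a rolling update no longer overwhelm PG momentarily.
3. **PG max_connections no longer needs frequent raising** (which avoids PG memory overhead, roughly 10 MB per connection).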
516+
517+
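### 10.3 Key decision: pooling mode

| Mode | Compatibility | Notes |
|---|---|---|
| `session pooling` | 100% compatible with ApeRAG | Each client holds one server connection until it disconnects, which is much the same as having no pgbouncer; **not recommended** |
| `transaction pooling` (earayu2 directive) | **Needs item-by-item verification** | A server connection is held only for the duration of a transaction, so pool_size can be far smaller than client_conn |
| `statement pooling` | Incompatible with ApeRAG | Consecutive statements are not guaranteed to reach the same server, which breaks multi-statement transactions and session state |

**Transaction pooling compatibility audit checklist** (every item must be ✅ before the PR):

- [ ] **prepared statements**: transaction pooling does not support reusing a prepared statement across transactions. SQLAlchemy does not use server-side prepared statements by default, but the asyncpg dialect sometimes enables them → grep to confirm `prepare_threshold` / `statement_cache_size=0`
- [ ] **`SET LOCAL`**: transaction-scoped, safe. Grep `session.execute("SET LOCAL ...")` and confirm every call sits inside a begin block.
- [ ] **`SET` (session-scoped)**: leaks across transactions into the next client; `SET LOCAL` must be used instead. Grep `session.execute("SET ...")` for any non-LOCAL usage.
- [ ] **temporary tables**: not visible across transactions; ApeRAG does not use them, ✅
- [ ] **listen/notify**: not visible across transactions; ApeRAG does not use them, ✅
- [ ] **advisory locks**: `pg_advisory_lock(...)` is session-scoped and will leak; `pg_advisory_xact_lock(...)` (transaction-scoped) must be used. A grep of the ApeRAG code found no advisory lock usage, ✅
- [ ] **`reset` behavior**: set `server_reset_query = DISCARD ALL` in `pgbouncer.ini` as a safety net (this is the default value). Note that pgbouncer runs this query only in session pooling unless `server_reset_query_always = 1` is also set, so in transaction mode it is not a substitute for the checks above.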
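
The grep steps in this checklist could be automated along the following lines. This is a rough sketch using two regular expressions; the pattern set and the `audit_source` name are assumptions, and a line-based regex scan will miss multi-line SQL strings and dynamically built statements, so it complements rather than replaces a manual review.

```python
import re
from typing import List

# Patterns that are unsafe under pgbouncer transaction pooling.
RISKY_PATTERNS = {
    # execute("SET ...") or execute(text("SET ...")) where the next word is not LOCAL
    "session-scoped SET": re.compile(
        r"""execute\(\s*(?:text\()?["']\s*SET\s+(?!LOCAL\b)""", re.IGNORECASE
    ),
    # pg_advisory_lock / pg_try_advisory_lock (+ _shared); the *_xact_* variants do not match
    "session-scoped advisory lock": re.compile(
        r"\bpg_(?:try_)?advisory_lock(?:_shared)?\s*\(", re.IGNORECASE
    ),
}


def audit_source(source: str) -> List[str]:
    """Return one finding per (line, pattern) match in the given source text."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                findings.append(f"line {lineno}: {label}")
    return findings


if __name__ == "__main__":
    sample = "\n".join([
        'session.execute(text("SET LOCAL statement_timeout = 5000"))',
        'session.execute(text("SET search_path TO public"))',
        'conn.execute("SELECT pg_advisory_lock(42)")',
        'conn.execute("SELECT pg_advisory_xact_lock(42)")',
    ])
    for finding in audit_source(sample):
        print(finding)
```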
534+
535+
### 10.4 Recommended pgbouncer parameters

```ini
# pgbouncer.ini: medium private deployment (Tier 2)
# Comments are on their own lines: pgbouncer does not treat "#" or ";" as a
# comment marker when it appears after a value.
[databases]
aperag = host=pg-cluster-postgresql-postgresql port=5432 dbname=postgres

[pgbouncer]
pool_mode = transaction
listen_port = 6432
# the ApeRAG side may hold up to 500 client connections
max_client_conn = 500
# 25 server connections per user/database pair by default
default_pool_size = 25
# 5 extra emergency connections under congestion
reserve_pool_size = 5
# fall back to the reserve only after a client has waited 3s
reserve_pool_timeout = 3
# close server connections idle for 10 min so PG memory stays stable
server_idle_timeout = 600
# only run in session pooling unless server_reset_query_always = 1
server_reset_query = DISCARD ALL
ignore_startup_parameters = extra_float_digits,application_name
# disabled in prod to reduce log load
log_connections = 0
log_disconnections = 0
```
555+
556+
Matching settings on the PG side:

```yaml
# KubeBlocks PG cluster values
postgresql:
  parameters:
    max_connections: "100"        # reserve part of this for pgbouncer (25*N + 10 maintenance)
    shared_buffers: "2GB"         # 25% of 8GB request
    work_mem: "16MB"              # complex vector / graph queries
    maintenance_work_mem: "256MB"
    max_wal_size: "2GB"
```
568+
569+
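### 10.5 ApeRAG-side changes

```yaml
# values.yaml
api:
  dbPoolSize: "20"                 # 5 → 20 (these connections end at pgbouncer; PG does not see the increase)
  dbMaxOverflow: "10"
indexingWorker:
  dbPoolSize: "30"                 # 10 → 30
  dbMaxOverflow: "10"
postgres:
  POSTGRES_HOST: "pgbouncer-svc"   # go through pgbouncer instead of connecting to pg-cluster directly
  POSTGRES_PORT: "6432"            # pgbouncer port
```

At the application layer each ApeRAG process holds up to `pool_size + max_overflow = 20 + 10 = 30` connections (the `api` values above; an `indexingWorker` replica can hold up to 30 + 10 = 40). Four such replicas present 120 clients to pgbouncer, which serves them from a pool of 25 server connections to PG. With PG max_connections=100, the remaining 75 are left for KubeBlocks maintenance + backup + monitoring + emergencies.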
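
The sizing arithmetic can be checked with a few lines of Python. This is a sketch under the assumption of a single user/database pair behind pgbouncer; `pgbouncer_budget` is a hypothetical helper. It also counts the 5-connection reserve pool from the recommended `pgbouncer.ini`, which leaves 70 rather than 75 PG connections when the reserve is fully in use.

```python
from typing import Dict


def pgbouncer_budget(replicas: int, pool_size: int, max_overflow: int,
                     max_client_conn: int, default_pool_size: int,
                     reserve_pool_size: int, pg_max_connections: int) -> Dict[str, int]:
    """Worst-case connection counts on both sides of pgbouncer."""
    clients = replicas * (pool_size + max_overflow)
    server_peak = default_pool_size + reserve_pool_size
    return {
        "client_connections": clients,
        "client_headroom": max_client_conn - clients,
        "server_connections_peak": server_peak,
        "pg_connections_left": pg_max_connections - server_peak,
    }


if __name__ == "__main__":
    budget = pgbouncer_budget(
        replicas=4, pool_size=20, max_overflow=10,
        max_client_conn=500, default_pool_size=25,
        reserve_pool_size=5, pg_max_connections=100,
    )
    for name, value in budget.items():
        print(f"{name}: {value}")
```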
585+
586+
**P1-Helm-6** Helm templates:

1. Add `deploy/databases/pgbouncer/values.yaml` (KubeBlocks pgbouncer addon)
2. Add `postgres.via_pgbouncer: true` to `deploy/aperag/values.yaml`, enabled by default
3. Change `aperag-secret.yaml` to inject `DATABASE_URL` through the pgbouncer SVC
4. Startup self-check: when going through pgbouncer, `SHOW pool_mode` must == `transaction`
592+
593+
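### 10.6 Verification

- Unit tests: mock pgbouncer with `pool_mode=transaction` and run every case under `tests/db/` (in particular the advisory-lock style cases in `test_collection_regen_lease.py`)
- Load test: start 4 replicas × `dbPoolSize=30` at the same time → pgbouncer must not run out of client_conn
- Regression: re-run the task #61 P2-S2 N-seed PG connection saturation test (the fix has already shipped) and confirm it stays stable once pgbouncer is introduced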
598+
599+
---
600+
601+
## 11. Configurability list (admin UI integration suggestions, per earayu2 msg=99c1d23a)

### 11.1 Current state

The `/admin/configuration` page (`web/src/app/admin/configuration/page.tsx`) already has:
- `ParserSettings`: `use_markitdown` / `use_mineru` / `mineru_api_token`
- `QuotaSettings`: per-user / per-collection quotas

Settings are persisted through `aperag/domains/governance/service/setting_service.py` (a DB key-value table); a single entry is updated with `update_setting(key, value)`.
610+
611+
### 11.2 New parameters found by the audit: recommended for the admin UI

#### Class A: runtime perf (**strongly recommended**: not ops-only, both operations and development care about these)

| Parameter | Current default | Suggested admin UI range | Card |
|---|---|---|---|
| `embedding_max_chunks_in_batch` | 10 | 8-128 | **new IndexingSettings card** |
| `embedding_max_workers` | 1 | 1-8 | IndexingSettings |
| `indexing_vector_concurrency` | 16 | 4-64 | IndexingSettings |
| `indexing_fulltext_concurrency` | 32 | 8-128 | IndexingSettings |
| `indexing_graph_facts_concurrency` | 4 | 1-16 | IndexingSettings |
| `indexing_graph_vectors_concurrency` | 4 | 1-16 | IndexingSettings |
| `indexing_summary_concurrency` | 4 | 1-16 | IndexingSettings |
| `indexing_vision_concurrency` | 4 | 1-16 | IndexingSettings |
| `indexing_parse_concurrency` | 8 | 1-32 | IndexingSettings |
| `indexing_reconcile_interval_seconds` | 30 | 10-300 | IndexingSettings (advanced) |
| `indexing_reconcile_batch_size` | 100 | 10-1000 | IndexingSettings (advanced) |
| `indexing_cleanup_interval_seconds` | 300 | 60-3600 | IndexingSettings (advanced) |
| `chunk_size` | 400 | 100-2000 | ParserSettings (existing card, new field) |
| `chunk_overlap_size` | 20 | 0-200 | ParserSettings |
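
The ranges in the table translate directly into a server-side validation step for the admin UI payload. The following is a minimal sketch: the defaults and bounds are copied from the Class A table, while `INDEXING_RANGES` and `validate_settings` are hypothetical names rather than existing ApeRAG code.

```python
from typing import Dict, Mapping, Tuple

# (default, minimum, maximum), copied from the Class A table.
INDEXING_RANGES: Dict[str, Tuple[int, int, int]] = {
    "embedding_max_chunks_in_batch": (10, 8, 128),
    "embedding_max_workers": (1, 1, 8),
    "indexing_vector_concurrency": (16, 4, 64),
    "indexing_fulltext_concurrency": (32, 8, 128),
    "indexing_graph_facts_concurrency": (4, 1, 16),
    "indexing_graph_vectors_concurrency": (4, 1, 16),
    "indexing_summary_concurrency": (4, 1, 16),
    "indexing_vision_concurrency": (4, 1, 16),
    "indexing_parse_concurrency": (8, 1, 32),
    "indexing_reconcile_interval_seconds": (30, 10, 300),
    "indexing_reconcile_batch_size": (100, 10, 1000),
    "indexing_cleanup_interval_seconds": (300, 60, 3600),
    "chunk_size": (400, 100, 2000),
    "chunk_overlap_size": (20, 0, 200),
}


def validate_settings(updates: Mapping[str, object]) -> Dict[str, int]:
    """Return defaults merged with `updates`, rejecting unknown or out-of-range values."""
    validated = {key: default for key, (default, _, _) in INDEXING_RANGES.items()}
    for key, value in updates.items():
        if key not in INDEXING_RANGES:
            raise KeyError(f"unknown setting: {key}")
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{key} must be an integer")
        _, low, high = INDEXING_RANGES[key]
        if not low <= value <= high:
            raise ValueError(f"{key}={value} outside {low}-{high}")
        validated[key] = value
    return validated
```

The frontend zod schema mentioned in §11.4 would mirror the same bounds, so that client and server reject the same inputs.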
631+
632+
#### Class B: collection-level (**already in the collection config**; the admin UI changes the default)

| Parameter | Purpose | admin UI card |
|---|---|---|
| `enable_vector` / `enable_fulltext` / `enable_knowledge_graph` / `enable_summary` / `enable_vision` | 5 lane on/off | Already on the collection creation page; admin sets the global default |
| `graph_extraction_window_size` | graph chunk window | Already in collection.config; no need to duplicate it in the admin UI |
| `graph_extraction_llm_concurrency` | graph LLM concurrency (suggested to extract in P2-7) | Same as above |
639+
640+
#### Class C: infra ops (**not exposed in the admin UI**: handled through Helm/env so that K8s configuration is not changed at runtime)

| Parameter | Reason |
|---|---|
| `db_pool_size` / `db_max_overflow` / `db_pool_timeout` | A change requires a process restart; these are deploy-time parameters, not runtime parameters |
| `indexing_queue_redis_url` / `indexing_quota_redis_url` | Deploy-time parameters |
| `pgbouncer pool_size / max_client_conn / server_idle_timeout` | pgbouncer's own configuration, unrelated to ApeRAG |
| `K8s resources.requests/limits` | Change in Helm, then rolling upgrade |
| `HPA min/max replicas` | Change in Helm, then rolling upgrade |
| `qdrant_quantization_*` / `qdrant_hnsw_on_disk` | Qdrant collection creation parameters; a change requires rebuilding the collection |
| `pgvector_hnsw_m / ef_construction` | Same as above |
| `MAX_DOCUMENT_SIZE` | Upload limit; the admin UI already has a quota card, so it belongs there rather than on the indexing card |
652+
653+
### 11.3 Recommended UI rollout

**P2-Admin-1** New `IndexingSettings` card (`web/src/app/admin/configuration/indexing-settings.tsx`):

```
┌─ Indexing Settings ───────────────────────────────────┐
│ [Embedding]                                           │
│   Max chunks per batch:    [____10__] (8-128)         │
│   Max parallel workers:    [_____1__] (1-8)           │
│                                                       │
│ [Worker concurrency] (per modality, asyncio Semaphore)│
│   Vector:                  [____16__] (4-64)          │
│   Fulltext:                [____32__] (8-128)         │
│   Graph facts:             [_____4__] (1-16)          │
│   Graph vectors:           [_____4__] (1-16)          │
│   Summary:                 [_____4__] (1-16)          │
│   Vision:                  [_____4__] (1-16)          │
│   Parse:                   [_____8__] (1-32)          │
│                                                       │
│ [Advanced: reconciler / cleanup]                      │
│   Reconcile interval (s):  [____30__] (10-300)        │
│   Reconcile batch size:    [___100__] (10-1000)       │
│   Cleanup interval (s):    [___300__] (60-3600)       │
│   Cleanup batch size:      [___200__] (10-1000)       │
│                                                       │
│ [⚠️ Changes need an indexing-worker pod restart]     │
│                                                       │
│ [Save] [Reset to defaults]                            │
└───────────────────────────────────────────────────────┘
```
683+
684+
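**Key UX decision**: a change is only written to the DB (settings table) and does not take effect immediately. On startup the worker reads the settings, which override the built-in defaults; an explicitly set env var still wins (last bullet). As a result:
- ops does not need to `kubectl edit` values.yaml and roll out an upgrade
- admin UI change → `kubectl rollout restart deployment/indexing-worker` → new values take effect within 30s
- compatible with the current env-based deployment: env has priority over DB settings (so an accidental admin UI change cannot interfere with an emergency mitigation)

**P2-Admin-2** Backend changes:
- `aperag/config.py`: add a helper `get_indexing_setting(key, default)` for the indexing_* parameters; at startup it reads env first and falls back to the DB settings table when env is unset
- `aperag/cli/indexing_worker.py:_amain()` injects the settings into `OrchestratorConfig` / `ParseOrchestratorConfig` at startup
- `aperag/domains/governance/service/setting_service.py` gains `get_indexing_settings()` / `update_indexing_settings(...)` helpers
- New OpenAPI route `GET/PUT /api/v1/admin/configuration/indexing` (modeled on the existing `/admin/configuration/parser`)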
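
A minimal sketch of the env > DB > default precedence follows. The name `get_indexing_setting(key, default)` comes from the text above; everything else is an assumption made for illustration: the DB settings are passed in as a plain mapping, the env var name is taken to be the upper-cased key, and the extra `db_settings` / `environ` parameters exist only to keep the example self-contained and testable.

```python
import os
from typing import Mapping, Optional


def get_indexing_setting(key: str, default: int,
                         db_settings: Mapping[str, str],
                         environ: Optional[Mapping[str, str]] = None) -> int:
    """Resolve one indexing_* value: env first, then DB settings, then default."""
    env = os.environ if environ is None else environ
    env_name = key.upper()  # assumed naming convention for the env override
    if env_name in env:
        return int(env[env_name])
    if key in db_settings:
        return int(db_settings[key])
    return default


if __name__ == "__main__":
    db = {"indexing_vector_concurrency": "32"}
    print(get_indexing_setting("indexing_vector_concurrency", 16, db, environ={}))
    print(get_indexing_setting(
        "indexing_vector_concurrency", 16, db,
        environ={"INDEXING_VECTOR_CONCURRENCY": "8"},
    ))
```

Keeping env on top means an operator can pin a value during an incident without touching the settings table.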
694+
695+
### 11.4 Hooks for the frontend (@dongdong)

- New `web/src/features/admin/indexing-settings/`: i18n keys `admin_config.indexing.*`, schema validation (zod)
- Add a sidebar menu item `Indexing`, either as its own card or grouped into the ParserSettings card
- e2e: `web-e2e/admin/configuration-indexing.spec.ts`
700+
701+
---
702+
703+
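## 12. (to be added by @ziang) Read path + GraphVectors/chunks.jsonl reuse boundary
## 13. (to be added by @ziang) cleanup / deletion / reconciler batch-delete SQL
## 14. (to be added by @ziang) End-to-end bottleneck attribution (your perspective, complementing §5)
## 15. (to be added by @ziang) KubeBlocks research (kubeblocks-skills repo) → to be folded into the §10 PG/pgbouncer section
## 16. (joint, pending) Acceptance checklist + Wave 1-4 schedule wrap-up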
403708
404709
---
405710
406-
> Document version: v1
711+
> Document version: v1 (§1-§11 by 符炫炜)
407712
> Author: @符炫炜 (architect)
408-
> Review: @ziang (§11-13 to be added)
713+
> Pending: @ziang §12-§16
409714
> Acceptance: @earayu2 (msg=718c79ba directive)
410715
> Tracking: @不穷 (PM)
716+
> main HEAD pin: `eb4c4f3d` (2026-04-30 18:46)
