docs: add ai related context

taloric · kylewanginchina · commit dd83be4dd9fb · 2026-04-01T15:09:37.000+08:00
diff --git a/.ai/agents.md b/.ai/agents.md
@@ -0,0 +1,153 @@
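+# deepflow-app Project
+
+## Project Overview
+
+deepflow-app is DeepFlow's backend application service. Its core function is **computing distributed-tracing flame graphs**: it runs the raw L7 Flow data collected by DeepFlow through a multi-stage pipeline and outputs a complete Span tree (the flame graph).
+
+Main entry point: [app/app/application/l7_flow_tracing.py](app/app/application/l7_flow_tracing.py)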
+
+---
+
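+## Technical Constraints
+
+### Runtime
+- **Python 3.10**
+- Raw Flow data is held in a `pandas.DataFrame`; the Sort stage converts it into a graph of `SpanNode` objects
+- Async framework: Sanic (HTTP service layer); the business logic itself is synchronous computation
+
+### Configuration
+- All tunable parameters live in [app/app.yaml](app/app.yaml) and are read at runtime through the `config` singleton in [app/app/config.py](app/app/config.py)
+- When adding configurable behavior, you **must** also add an explanatory comment in `app.yaml`, and the default value should stay backward compatible (opt-in principle)
+
+### Backward Compatibility
+- **Do not break existing tracing results**: a new connection strategy that could change existing results must be disabled by default
+- When modifying an existing connection condition, assess the impact on the flame graphs of existing users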
+
+---
+
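+## Core Architecture: Tracing Pipeline
+
+```
+Search → Merge → Sort → Prune → Statistics
+(1)      (2)     (3)    (4)     (5)
+```
+
+### 1. Search
+- Starting from the "entry Flow", expand iteratively, pulling related Flows by `trace_id` / `tcp_seq` / `syscall_trace_id` / `x_request_id`
+- Limits: `config.max_iteration` (default 30), `config.l7_tracing_limit` (default 1000)
+- Control parameter: `config.tracing_source` (a list that controls which expansion dimensions are enabled)
+
+### 2. Merge
+- Merge one-way Flows (request only or response only) into complete sessions
+- Process Flows in ascending `start_time` order; each Response is merged into the preceding Request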
+
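+### 3. Sort (the most complex stage)
+
+#### SpanSet Construction
+
+| Type | Composition | Key internal constraints |
+|---|---|---|
+| **NetworkSpanSet** | 0-1 c-p + N network Spans + 0-1 s-p | ① all Spans must have the same `tcp_seq`; ② flow information (five-tuple, etc.) must be equal |
+| **ProcessSpanSet** | 0-1 s-p + N App Spans + M c-p | ① all Spans must have the same **process information**; ② the `s-p` time range must **fully cover** that of every `c-p` |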
+
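+#### SpanSet Connection (`_connect_process_and_networks`)
+
+Connection runs in two phases:
+
+**Accurate connection phase (scenarios 1-6)**: based on strong association evidence, executed in order; results are written to `network_match_parent`
+
+| Scenario | Connection direction | Core condition |
+|---|---|---|
+| 1 | Process leaf → NetworkSpanSet root | Share the same `c-p`, or leaf `span_id` = first network `span_id` |
+| 2 | NetworkSpanSet tail → Process root | Share the same `s-p`, or root `span_id` = network tail `parent_span_id`/`span_id` |
+| 3 | Process leaf → Process root | Share the same `c-p`, or leaf `span_id` = root `parent_span_id`/`span_id` |
+| 4 | Process root → any Span inside a Process | Root `parent_span_id` = target `span_id` |
+| 5 | NetworkSpanSet → NetworkSpanSet | `x_request_id` matches / same `span_id` / same gRPC `stream_id`; the former's `response_duration` ≥ the latter's |
+| 6 | NetworkSpanSet → NetworkSpanSet (async MQ) | `trace_id` sets intersect; only for WebSphereMQ with `is_async`; client start time is earlier than the server's |
+
+**General constraints (scenarios 1-5)**: the two Spans must not belong to the same SpanSet; when both are under the same Agent, the Parent's latency must be greater than the Child's.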
+
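+**Weak association phase (scenarios 7-8)**: a separate pass; results are written to `weak_match_parent` and do not affect accurate connections
+
+| Scenario | Condition | Config switch |
+|---|---|---|
+| 7 | client-side leaf (no children) → server-side root (no parent, `is_net_root=True`); `trace_id` sets intersect and `tcp_seq` differs; leaf `response_duration` ≥ root's | `span_set_connection_strategies: [net_span_c_to_s_via_trace_id]` |
+| 8 | client-side leaf (no children) → server-side root (no parent, `is_ps_root=True`); `trace_id` sets intersect and `auto_instance` and `auto_instance_type` differ; leaf and root have the same `agent_id`, and the leaf's time range covers the root's | `span_set_connection_strategies: [sys_span_s_to_c_via_trace_id]` |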
+
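+### 4. Prune
+- When there are multiple trees, take the tree containing the entry Span as the baseline and prune trees whose clock offset exceeds `host_clock_offset_us`
+- Spans with the same `trace_id` are not pruned; Spans connected via strong association (`x_request_id`, etc.) are not pruned
+
+### 5. Statistics
+- Compute each Span's self latency top-down (the Parent's latency minus the sum of its direct children's latencies)
+- Aggregate total service latency grouped by AutoService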
+
+---
+
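+## Review / Lint Guidelines
+
+### Mandatory checklist when adding a new connection relationship
+
+When adding a new SpanSet connection scenario (see `_connect_process_and_networks`), you must cover all of the following development points:
+
+#### 1. Core logic
+- [ ] Is the connection condition clearly defined (direction, fields, thresholds)?
+- [ ] Is SpanNode state such as `is_net_root` / `is_net_leaf` / `children_count` / `get_parent_id()` checked correctly?
+- [ ] Is `_same_span_set()` called to prevent the head and tail of the same set from being connected to each other?
+- [ ] Is the `response_duration` constraint checked (Parent latency must be ≥ Child latency)?
+- [ ] Is a meaningful `reason` string (used in debug logs) passed when calling `set_parent()`?
+
+#### 2. Accurate vs. weak association
+- [ ] Is the new scenario an **accurate connection** (strong evidence: `tcp_seq` / `span_id` / `x_request_id`) or a **weak association** (inferred)?
+- [ ] A weak association scenario must:
+  - Write to the separate `weak_match_parent` dict and run on its own after the accurate connection phase has finished
+  - Be opt-in through `config.span_set_connection_strategies` and inactive by default
+
+#### 3. Configuration
+- [ ] If the new scenario needs a config switch, is it read in `parse_spec()` in `app/app/config.py` and assigned to the `config` object?
+- [ ] Has a commented config entry (purpose, default value, allowed values) been added under the `spec` section of `app/app.yaml`?
+- [ ] Does the default value stay backward compatible (no change to existing users' tracing results)?
+
+#### 4. Documentation
+- [ ] Has the "SpanSet connection" chapter of [HOW-TO-GET-SPAN-LIKE-DATA.md](HOW-TO-GET-SPAN-LIKE-DATA.md) been updated with a description of the new scenario (connection direction, conditions, config entry)?
+- [ ] Has the "SpanSet connection" quick-reference table in [FlowTracingIssue.md](FlowTracingIssue.md) been updated?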
+
+---
+
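+### Performance checks
+
+`_connect_process_and_networks` is an O(N²) double loop, where N is the total number of SpanNodes (bounded by `l7_tracing_limit`). When adding a new scenario:
+
+- [ ] **Avoid recomputing inside the inner loop**: cache invariants (such as `get_req_tcp_seq()` and `get_trace_id_set()`) in local variables up front
+- [ ] **Prune early**: put the condition that filters out the most candidates first in the inner loop (fail-fast)
+- [ ] **Avoid extra data structures**: do not create new lists/dicts inside the loop; if one is needed, reuse existing structures such as `flow_index_to_span` / `related_flow_index_map`
+- [ ] **Assess the worst case**: at `l7_tracing_limit=1000`, is the number of extra loop iterations added by the new scenario acceptable?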
+
+---
+
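+## Python Code Review
+
+> This is a Python project; a dedicated review is recommended after any larger change.
+
+Review points: PEP 8 compliance, type annotations, Pythonic idioms, potential performance problems, and security risks.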
+
+---
+
+## Key File Index
+
+| File | Purpose |
+|---|---|
+| [app/app/application/l7_flow_tracing.py](app/app/application/l7_flow_tracing.py) | All core logic of the tracing pipeline |
+| [app/app/config.py](app/app/config.py) | Config parsing, the `config` singleton |
+| [app/app.yaml](app/app.yaml) | Config file template with comments |
+| [HOW-TO-GET-SPAN-LIKE-DATA.md](HOW-TO-GET-SPAN-LIKE-DATA.md) | Full design document for the tracing computation |
+| [FlowTracingIssue.md](FlowTracingIssue.md) | Broken-trace diagnosis knowledge base and quick-reference table |
+
+## Key Function Index
+
+| Function | Location | Description |
+|---|---|---|
+| `_connect_process_and_networks` | l7_flow_tracing.py ~L3242 | SpanSet connection, 8 scenarios |
+| `_same_span_set` | l7_flow_tracing.py ~L3235 | Helper that prevents the head and tail of the same set from being connected |
+| `merge_flow` | l7_flow_tracing.py ~L1845 | Merges one-way Flows into sessions |
+| `Config.parse_spec` | config.py L17 | Parses the `spec` config section |
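
Review note on `.ai/agents.md`: the weak-association rules and the performance checklist may be easier to apply with a concrete shape in mind. The following is only a minimal sketch, not the code in `l7_flow_tracing.py`. The `Span` dataclass, its field names, and the `weak_connect` function are hypothetical stand-ins invented for illustration; only the strategy name `net_span_c_to_s_via_trace_id` and the dict name `weak_match_parent` come from the document. It shows a scenario-7 style pass that is opt-in, skips roots that already have an accurate parent, caches per-root invariants outside the inner loop, and runs the cheap checks before the set intersection.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical stand-in for SpanNode; only the fields this sketch needs."""
    index: int
    span_set_id: int
    tcp_seq: int
    trace_ids: frozenset
    response_duration: int
    is_client_leaf: bool = False  # client side, no children
    is_net_root: bool = False     # server side, no parent


STRATEGY = "net_span_c_to_s_via_trace_id"


def weak_connect(spans, enabled_strategies, exact_parents):
    """Return {child_index: parent_index} for scenario-7 style weak matches.

    exact_parents maps child index -> parent index from the accurate phase;
    roots that already appear there are left untouched.
    """
    weak_match_parent = {}
    # Opt-in: do nothing unless the strategy is explicitly enabled.
    if STRATEGY not in enabled_strategies:
        return weak_match_parent
    # Build the candidate lists once, outside the nested loop.
    leaves = [s for s in spans if s.is_client_leaf]
    roots = [s for s in spans if s.is_net_root and s.index not in exact_parents]
    for root in roots:
        # Cache per-root invariants before entering the inner loop.
        root_set = root.span_set_id
        root_seq = root.tcp_seq
        root_duration = root.response_duration
        root_traces = root.trace_ids
        for leaf in leaves:
            # Cheap integer comparisons first (fail-fast).
            if leaf.span_set_id == root_set or leaf.tcp_seq == root_seq:
                continue
            if leaf.response_duration < root_duration:
                continue
            # Set intersection last, since it is the most expensive check here.
            if root_traces.isdisjoint(leaf.trace_ids):
                continue
            weak_match_parent[root.index] = leaf.index
            break  # this sketch keeps the first matching leaf
    return weak_match_parent
```

With an empty strategy list the function returns an empty dict, which mirrors the "disabled by default" rule; which check filters the most candidates depends on real data, so the ordering above is an assumption to be validated against actual traces.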
diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md
@@ -0,0 +1 @@
+../.ai/agents.md
diff --git a/.codex/agents.md b/.codex/agents.md
@@ -0,0 +1 @@
+../.ai/agents.md
diff --git a/.serena/.gitignore b/.serena/.gitignore
@@ -0,0 +1 @@
+/cache
diff --git a/.serena/project.yml b/.serena/project.yml
@@ -0,0 +1,146 @@
+# list of languages for which language servers are started; choose from:
+#   al                  bash                clojure             cpp                 csharp
+#   csharp_omnisharp    dart                elixir              elm                 erlang
+#   fortran             fsharp              go                  groovy              haskell
+#   java                julia               kotlin              lua                 markdown
+#   matlab              nix                 pascal              perl                php
+#   powershell          python              python_jedi         r                   rego
+#   ruby                ruby_solargraph     rust                scala               swift
+#   terraform           toml                typescript          typescript_vts      vue
+#   yaml                zig
+#   (This list may be outdated. For the current list, see values of Language enum here:
+#   https://github.com/oraios/serena/blob/main/src/solidlsp/ls_config.py
+#   For some languages, there are alternative language servers, e.g. csharp_omnisharp, ruby_solargraph.)
+# Note:
+#   - For C, use cpp
+#   - For JavaScript, use typescript
+#   - For Free Pascal/Lazarus, use pascal
+# Special requirements:
+#   - csharp: Requires the presence of a .sln file in the project folder.
+#   - pascal: Requires Free Pascal Compiler (fpc) and optionally Lazarus.
+# When using multiple languages, the first language server that supports a given file will be used for that file.
+# The first language is the default language and the respective language server will be used as a fallback.
+# Note that when using the JetBrains backend, language servers are not used and this list is correspondingly ignored.
+languages:
+- python
+
+# the encoding used by text files in the project
+# For a list of possible encodings, see https://docs.python.org/3.11/library/codecs.html#standard-encodings
+encoding: "utf-8"
+
+# whether to use project's .gitignore files to ignore files
+ignore_all_files_in_gitignore: true
+
+# list of additional paths to ignore in all projects
+# same syntax as gitignore, so you can use * and **
+ignored_paths: []
+
+# whether the project is in read-only mode
+# If set to true, all editing tools will be disabled and attempts to use them will result in an error
+# Added on 2025-04-18
+read_only: false
+
+# list of tool names to exclude. We recommend not excluding any tools, see the readme for more details.
+# Below is the complete list of tools for convenience.
+# To make sure you have the latest list of tools, and to view their descriptions, 
+# execute `uv run scripts/print_tool_overview.py`.
+#
+#  * `activate_project`: Activates a project by name.
+#  * `check_onboarding_performed`: Checks whether project onboarding was already performed.
+#  * `create_text_file`: Creates/overwrites a file in the project directory.
+#  * `delete_lines`: Deletes a range of lines within a file.
+#  * `delete_memory`: Deletes a memory from Serena's project-specific memory store.
+#  * `execute_shell_command`: Executes a shell command.
+#  * `find_referencing_code_snippets`: Finds code snippets in which the symbol at the given location is referenced.
+#  * `find_referencing_symbols`: Finds symbols that reference the symbol at the given location (optionally filtered by type).
+#  * `find_symbol`: Performs a global (or local) search for symbols with/containing a given name/substring (optionally filtered by type).
+#  * `get_current_config`: Prints the current configuration of the agent, including the active and available projects, tools, contexts, and modes.
+#  * `get_symbols_overview`: Gets an overview of the top-level symbols defined in a given file.
+#  * `initial_instructions`: Gets the initial instructions for the current project.
+#     Should only be used in settings where the system prompt cannot be set,
+#     e.g. in clients you have no control over, like Claude Desktop.
+#  * `insert_after_symbol`: Inserts content after the end of the definition of a given symbol.
+#  * `insert_at_line`: Inserts content at a given line in a file.
+#  * `insert_before_symbol`: Inserts content before the beginning of the definition of a given symbol.
+#  * `list_dir`: Lists files and directories in the given directory (optionally with recursion).
+#  * `list_memories`: Lists memories in Serena's project-specific memory store.
+#  * `onboarding`: Performs onboarding (identifying the project structure and essential tasks, e.g. for testing or building).
+#  * `prepare_for_new_conversation`: Provides instructions for preparing for a new conversation (in order to continue with the necessary context).
+#  * `read_file`: Reads a file within the project directory.
+#  * `read_memory`: Reads the memory with the given name from Serena's project-specific memory store.
+#  * `remove_project`: Removes a project from the Serena configuration.
+#  * `replace_lines`: Replaces a range of lines within a file with new content.
+#  * `replace_symbol_body`: Replaces the full definition of a symbol.
+#  * `restart_language_server`: Restarts the language server, may be necessary when edits not through Serena happen.
+#  * `search_for_pattern`: Performs a search for a pattern in the project.
+#  * `summarize_changes`: Provides instructions for summarizing the changes made to the codebase.
+#  * `switch_modes`: Activates modes by providing a list of their names
+#  * `think_about_collected_information`: Thinking tool for pondering the completeness of collected information.
+#  * `think_about_task_adherence`: Thinking tool for determining whether the agent is still on track with the current task.
+#  * `think_about_whether_you_are_done`: Thinking tool for determining whether the task is truly completed.
+#  * `write_memory`: Writes a named memory (for future reference) to Serena's project-specific memory store.
+excluded_tools: []
+
+# initial prompt for the project. It will always be given to the LLM upon activating the project
+# (contrary to the memories, which are loaded on demand).
+initial_prompt: ""
+# the name by which the project can be referenced within Serena
+project_name: "deepflow-app"
+
+# list of tools to include that would otherwise be disabled (particularly optional tools that are disabled by default)
+included_optional_tools: []
+
+# list of mode names that are always to be included in the set of active modes
+# The full set of modes to be activated is base_modes + default_modes.
+# If the setting is undefined, the base_modes from the global configuration (serena_config.yml) apply.
+# Otherwise, this setting overrides the global configuration.
+# Set this to [] to disable base modes for this project.
+# Set this to a list of mode names to always include the respective modes for this project.
+base_modes:
+
+# list of mode names that are to be activated by default.
+# The full set of modes to be activated is base_modes + default_modes.
+# If the setting is undefined, the default_modes from the global configuration (serena_config.yml) apply.
+# Otherwise, this overrides the setting from the global configuration (serena_config.yml).
+# This setting can, in turn, be overridden by CLI parameters (--mode).
+default_modes:
+
+# fixed set of tools to use as the base tool set (if non-empty), replacing Serena's default set of tools.
+# This cannot be combined with non-empty excluded_tools or included_optional_tools.
+fixed_tools: []
+
+# time budget (seconds) per tool call for the retrieval of additional symbol information
+# such as docstrings or parameter information.
+# This overrides the corresponding setting in the global configuration; see the documentation there.
+# If null or missing, use the setting from the global configuration.
+symbol_info_budget:
+
+# The language backend to use for this project.
+# If not set, the global setting from serena_config.yml is used.
+# Valid values: LSP, JetBrains
+# Note: the backend is fixed at startup. If a project with a different backend
+# is activated post-init, an error will be returned.
+language_backend:
+
+# list of regex patterns which, when matched, mark a memory entry as read-only.
+# Extends the list from the global configuration, merging the two lists.
+read_only_memory_patterns: []
+
+# line ending convention to use when writing source files.
+# Possible values: unset (use global setting), "lf", "crlf", or "native" (platform default)
+# This does not affect Serena's own files (e.g. memories and configuration files), which always use native line endings.
+line_ending:
+
+# list of regex patterns for memories to completely ignore.
+# Matching memories will not appear in list_memories or activate_project output
+# and cannot be accessed via read_memory or write_memory.
+# To access ignored memory files, use the read_file tool on the raw file path.
+# Extends the list from the global configuration, merging the two lists.
+# Example: ["_archive/.*", "_episodes/.*"]
+ignored_memory_patterns: []
+
+# advanced configuration option allowing to configure language server-specific options.
+# Maps the language key to the options.
+# Have a look at the docstring of the constructors of the LS implementations within solidlsp (e.g., for C# or PHP) to see which options are available.
+# No documentation on options means no options are available.
+ls_specific_settings: {}
diff --git a/requirements3.txt b/requirements3.txt
@@ -0,0 +1,34 @@
+aiofiles==23.2.1
+aiohttp==3.10.11
+aiosignal==1.3.1
+async-timeout==4.0.3
+attrs==23.1.0
+Deprecated==1.2.14
+frozenlist==1.4.0
+html5tagger==1.3.0
+httptools==0.6.1
+idna==3.7
+importlib-metadata==6.8.0
+multidict==6.0.4
+numpy==1.24.4
+opentelemetry-api==1.21.0
+opentelemetry-sdk==1.21.0
+opentelemetry-semantic-conventions==0.42b0
+pandas==2.0.3
+python-dateutil==2.8.2
+pytz==2023.3.post1
+PyYAML==6.0.1
+sanic==23.6.0
+sanic-routing==23.6.0
+schematics==2.1.1
+six==1.16.0
+tracerite==1.1.1
+typing_extensions==4.8.0
+tzdata==2023.3
+ujson==5.8.0
+uvloop==0.19.0
+websockets==12.0
+wrapt==1.16.0
+yarl==1.12.1
+zipp==3.19.2
+setuptools==68.1.2