-
Notifications
You must be signed in to change notification settings - Fork 664
🐛 Bugfix: Fixed an issue where the MCP service failed to start in a Kubernetes pod. #3254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -8,6 +8,7 @@ | |
| import asyncio | ||
| import logging | ||
| import socket | ||
| import re | ||
| import uuid | ||
|
|
||
| import kubernetes | ||
|
|
@@ -23,6 +24,47 @@ | |
|
|
||
| logger = logging.getLogger("nexent.container.kubernetes") | ||
|
|
||
| # Kubernetes naming constraints: lowercase alphanumeric or dash, cannot start/end with dash, | ||
| # cannot have consecutive dashes, max 253 characters | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [P2] 注释写 max 253,但这个 sanitizer 没有实际 enforce 长度,而且很多使用点是 label value/名称片段,常见上限是 63。注释会让调用方误以为长度已经安全,建议把长度作为参数并在函数内截断。
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [P2] 注释写 max 253,但这个 sanitizer 没有实际 enforce 长度,而且很多使用点是 label value/名称片段,常见上限是 63。注释会让调用方误以为长度已经安全,建议把长度作为参数并在函数内截断。
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 事后审查补充:[P2] 注释写 max 253,但这个 sanitizer 没有实际 enforce 长度,而且很多使用点是 label value/名称片段,常见上限是 63。注释会让调用方误以为长度已经安全,建议把长度作为参数并在函数内截断。 影响:这个问题合入后会在对应部署、运行或权限场景中留下真实故障/安全风险,后续排查成本较高。
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 二次事后审查补充:[P2] 注释写 max 253,但这个 sanitizer 没有实际 enforce 长度,而且很多使用点是 label value/名称片段,常见上限是 63。注释会让调用方误以为长度已经安全,建议把长度作为参数并在函数内截断。 影响:该问题合入后仍可能在真实部署、运行、权限或测试场景中形成回归风险。 |
||
| K8S_NAME_PATTERN = re.compile(r"[^a-z0-9-]+") | ||
| K8S_CONSECUTIVE_DASHES = re.compile(r"-+") | ||
|
|
||
|
|
||
| def _sanitize_k8s_name(name: str) -> str: | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
| """Convert arbitrary string to valid Kubernetes resource name. | ||
|
|
||
| Rules: | ||
| - Convert to lowercase | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
| - Replace invalid characters with dash | ||
| - Collapse consecutive dashes | ||
| - Remove leading/trailing dashes | ||
| - Must start with alphanumeric | ||
|
|
||
| Args: | ||
| name: Input string to sanitize | ||
|
|
||
| Returns: | ||
| Valid Kubernetes name (lowercase alphanumeric and dashes only) | ||
| """ | ||
| if not name: | ||
| return "unknown" | ||
|
|
||
| # Lowercase and replace invalid chars with dash | ||
| sanitized = K8S_NAME_PATTERN.sub("-", name.lower()) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [P2] name 标注为 str,但测试又传 None;如果传入 truthy 的非字符串对象,name.lower() 会抛 AttributeError。边界入口应先 str(name) 或显式拒绝非字符串。
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [P2] name 标注为 str,但测试又传 None;如果传入 truthy 的非字符串对象,name.lower() 会抛 AttributeError。边界入口应先 str(name) 或显式拒绝非字符串。
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 事后审查补充:[P2] name 标注为 str,但测试又传 None;如果传入 truthy 的非字符串对象,name.lower() 会抛 AttributeError。边界入口应先 str(name) 或显式拒绝非字符串。 影响:这个问题合入后会在对应部署、运行或权限场景中留下真实故障/安全风险,后续排查成本较高。
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 二次事后审查补充:[P2] name 标注为 str,但测试又传 None;如果传入 truthy 的非字符串对象,name.lower() 会抛 AttributeError。边界入口应先 str(name) 或显式拒绝非字符串。 影响:该问题合入后仍可能在真实部署、运行、权限或测试场景中形成回归风险。 |
||
|
|
||
| # Collapse consecutive dashes | ||
| sanitized = K8S_CONSECUTIVE_DASHES.sub("-", sanitized) | ||
|
|
||
| # Remove leading/trailing dashes | ||
| sanitized = sanitized.strip("-") | ||
|
|
||
| # Ensure it starts with alphanumeric | ||
| if sanitized and not sanitized[0].isalnum(): | ||
| sanitized = "x" + sanitized | ||
|
|
||
| # Fallback if empty | ||
| return sanitized if sanitized else "unknown" | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [P1] 返回值没有任何长度限制,长 service_name 经过清洗后仍然可以非常长,最终 pod name 会超过 Kubernetes 限制。这里应该在 sanitizer 或 _generate_pod_name 中截断并附稳定 hash。
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [P1] 返回值没有任何长度限制,长 service_name 经过清洗后仍然可以非常长,最终 pod name 会超过 Kubernetes 限制。这里应该在 sanitizer 或 _generate_pod_name 中截断并附稳定 hash。
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 事后审查补充:[P1] 返回值没有任何长度限制,长 service_name 经过清洗后仍然可以非常长,最终 pod name 会超过 Kubernetes 限制。这里应该在 sanitizer 或 _generate_pod_name 中截断并附稳定 hash。 影响:这个问题合入后会在对应部署、运行或权限场景中留下真实故障/安全风险,后续排查成本较高。
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 二次事后审查补充:[P1] 返回值没有任何长度限制,长 service_name 经过清洗后仍然可以非常长,最终 pod name 会超过 Kubernetes 限制。这里应该在 sanitizer 或 _generate_pod_name 中截断并附稳定 hash。 影响:该问题合入后仍可能在真实部署、运行、权限或测试场景中形成回归风险。 |
||
|
|
||
|
|
||
| class ContainerError(Exception): | ||
| """Raised when container operation fails""" | ||
|
|
@@ -77,9 +119,9 @@ def __init__(self, config: KubernetesContainerConfig): | |
|
|
||
| def _generate_pod_name(self, service_name: str, tenant_id: str, user_id: str) -> str: | ||
| """Generate unique pod name with service, tenant, and user segments.""" | ||
| safe_name = "".join(c if c.isalnum() or c == "-" else "-" for c in service_name) | ||
| tenant_part = (tenant_id or "")[:8] | ||
| user_part = (user_id or "")[:8] | ||
| safe_name = _sanitize_k8s_name(service_name) | ||
| tenant_part = _sanitize_k8s_name(tenant_id)[:8] | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [P2] tenant_part/user_part 只取前 8 位,tenantabcdef 和 tenantabcxyz 会生成同样前缀,排障和清理时容易误判。建议用前缀+短 hash,而不是纯截断。
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [P2] tenant_part/user_part 只取前 8 位,tenantabcdef 和 tenantabcxyz 会生成同样前缀,排障和清理时容易误判。建议用前缀+短 hash,而不是纯截断。
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 事后审查补充:[P2] tenant_part/user_part 只取前 8 位,tenantabcdef 和 tenantabcxyz 会生成同样前缀,排障和清理时容易误判。建议用前缀+短 hash,而不是纯截断。 影响:这个问题合入后会在对应部署、运行或权限场景中留下真实故障/安全风险,后续排查成本较高。
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 二次事后审查补充:[P2] tenant_part/user_part 只取前 8 位,tenantabcdef 和 tenantabcxyz 会生成同样前缀,排障和清理时容易误判。建议用前缀+短 hash,而不是纯截断。 影响:该问题合入后仍可能在真实部署、运行、权限或测试场景中形成回归风险。 |
||
| user_part = _sanitize_k8s_name(user_id)[:8] | ||
| uuid_part = uuid.uuid4().hex[:8] | ||
| return f"mcp-{safe_name}-{tenant_part}-{user_part}-{uuid_part}" | ||
|
|
||
|
|
@@ -486,7 +528,7 @@ def list_containers( | |
|
|
||
| # Filter by service_name if provided | ||
| if service_name: | ||
| safe_name = "".join(c if c.isalnum() or c == "-" else "-" for c in service_name) | ||
| safe_name = _sanitize_k8s_name(service_name) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [P2] list_containers 用 sanitize 后的 safe_name 再做子串匹配,service=api 会匹配到 my-api-v2 等不相关 pod。应和创建时的 component label 做精确匹配,或保存完整 sanitized service_name label。
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [P2] list_containers 用 sanitize 后的 safe_name 再做子串匹配,service=api 会匹配到 my-api-v2 等不相关 pod。应和创建时的 component label 做精确匹配,或保存完整 sanitized service_name label。
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 事后审查补充:[P2] list_containers 用 sanitize 后的 safe_name 再做子串匹配,service=api 会匹配到 my-api-v2 等不相关 pod。应和创建时的 component label 做精确匹配,或保存完整 sanitized service_name label。 影响:这个问题合入后会在对应部署、运行或权限场景中留下真实故障/安全风险,后续排查成本较高。
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 二次事后审查补充:[P2] list_containers 用 sanitize 后的 safe_name 再做子串匹配,service=api 会匹配到 my-api-v2 等不相关 pod。应和创建时的 component label 做精确匹配,或保存完整 sanitized service_name label。 影响:该问题合入后仍可能在真实部署、运行、权限或测试场景中形成回归风险。 |
||
| pod_component = labels.get(self.LABEL_COMPONENT, "") | ||
| if safe_name not in pod_component: | ||
| continue | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[逻辑漏洞]
_sanitize_k8s_name函数在注释中提到了 Kubernetes 命名约束(最大 253 字符),但实际代码中未对长度进行截断。如果输入名称超过 253 字符,生成的 K8s 资源名称将无效,导致创建失败。建议在函数末尾添加长度截断逻辑:sanitized = sanitized[:253].rstrip('-')。