Skip to content

Commit d5390cc

Browse files
authored
feat: container build context config + full build log preservation (#1110)
<h3>PR Summary by Qodo</h3> Add container build context option and preserve full redacted build logs <code>✨ Enhancement</code> <code>🐞 Bug fix</code> <code>🧪 Tests</code> <code>⚙️ Configuration changes</code> <code>📝 Documentation</code> <code>🕐 20-40 Minutes</code> <img src="https://www.qodo.ai/wp-content/uploads/2025/11/light-grey-line.svg" height="10%" alt="Grey Divider"> <h3>Walkthroughs</h3> <details open> <summary>User Description</summary> <br/> ## Summary Two container build improvements: ### Full build log preservation (#1090) - Log full redacted build output before truncation for debugging - Extract `_redact_output()` helper for reusable secret redaction - Truncation threshold uses ANSI-stripped lengths consistently ### Docker build context from non-root directory (#1065) - Add `context` config option under `container` for non-root build context - Path traversal validation prevents context escaping worktree root - Schema pattern constraint rejects obviously bad values Closes #1090 Closes #1065 </details> <details open> <summary>AI Description</summary> <br/> <pre> • Add <b><i>container.context</i></b> config to build images from a repo subdirectory. • Validate context stays within the checked-out worktree to prevent path traversal. • Log full redacted build output before GitHub check-run truncation for debugging. </pre> </details> <details> <summary>Diagram</summary> <br/> ```mermaid graph TD cfg["Container config"] --> loader["Config loader"] --> runner["RunnerHandler"] --> podman["Podman build"] runner --> check["CheckRunHandler"] --> gh["GitHub Checks API"] ``` </details> <details> <summary>High-Level Assessment</summary> <br/> The following are alternative approaches to this PR: <details> <summary><b>1. Use commonpath/is_relative_to for context validation</b></summary> - ➕ Less error-prone than string prefix checks - ➕ More readable intent: “resolved path is within worktree” - ➖ `Path.is_relative_to` requires Python 3.9+; `commonpath` still needs careful normalization/symlink handling </details> <details> <summary><b>2. Persist full build logs as artifacts instead of debug logs</b></summary> - ➕ Avoids flooding debug logs while still preserving full output - ➕ Makes large logs easier to browse/download - ➖ Requires artifact storage/retention and plumbing (upload, linking, cleanup) - ➖ More moving parts than a server-side debug log </details> **Recommendation:** The PR’s approach is reasonable: config-driven context selection with realpath-based containment checks, plus pre-truncation redacted logging gated on truncation. If context validation edge cases become a concern, consider switching to `os.path.commonpath` (or `Path.resolve().is_relative_to()` where supported) to express containment more directly. </details> <img src="https://www.qodo.ai/wp-content/uploads/2025/11/light-grey-line.svg" height="10%" alt="Grey Divider"> <h3>File Changes</h3> <details> <summary><strong>Enhancement</strong> (2)</summary> <blockquote> <details> <summary><strong>github_api.py</strong> <code>Read container context from repo config into webhook runtime</code> <code>+1/-0</code></summary> <br/> >Read container context from repo config into webhook runtime > ><pre> >• Extends repository config ingestion to populate &#x27;container_context&#x27; from &#x27;container.context&#x27;, defaulting to empty string when unset. ></pre> > ><a href='https://github.com/myk-org/github-webhook-server/pull/1110/files#diff-7c5f6dfcadb38e75c2d0f1d418ba1a861cc9f6c0efe72905a250e9f43a6cfdcf'>webhook_server/libs/github_api.py</a> <hr/> </details> </blockquote> <blockquote> <details> <summary><strong>runner_handler.py</strong> <code>Support building containers from a configured subdirectory context</code> <code>+26/-6</code></summary> <br/> >Support building containers from a configured subdirectory context > ><pre> >• Uses &#x27;container_context&#x27; to select the podman build context directory while keeping the Dockerfile path rooted at the worktree. Adds realpath-based validation to reject contexts that resolve outside the checked-out worktree and fails the check run instead of building. ></pre> > ><a href='https://github.com/myk-org/github-webhook-server/pull/1110/files#diff-0cb54c95cafda12d8d169c7b03ac484738f4cf925c22f6e6b8b8c5db0730ce42'>webhook_server/libs/handlers/runner_handler.py</a> <hr/> </details> </blockquote> </details> <details> <summary><strong>Bug fix</strong> (1)</summary> <blockquote> <details> <summary><strong>check_run_handler.py</strong> <code>Preserve full redacted output in logs before check-run truncation</code> <code>+38/-18</code></summary> <br/> >Preserve full redacted output in logs before check-run truncation > ><pre> >• Extracts a reusable &#x27;_redact_output()&#x27; helper and uses it both for returned check-run text and for a new debug log that captures the full redacted output when truncation is required. Truncation detection is based on ANSI-stripped lengths to match what is sent to GitHub. ></pre> > ><a href='https://github.com/myk-org/github-webhook-server/pull/1110/files#diff-6b505edd8695d646f7fec8e71517af3e42df16342d1409a74c8212063ab26f56'>webhook_server/libs/handlers/check_run_handler.py</a> <hr/> </details> </blockquote> </details> <details> <summary><strong>Tests</strong> (3)</summary> <blockquote> <details> <summary><strong>test_check_run_handler.py</strong> <code>Add tests for pre-truncation full-output logging and redaction</code> <code>+53/-0</code></summary> <br/> >Add tests for pre-truncation full-output logging and redaction > ><pre> >• Adds coverage ensuring the full output debug log is emitted only when truncation occurs and that secrets are redacted in the logged content. ></pre> > ><a href='https://github.com/myk-org/github-webhook-server/pull/1110/files#diff-ab2e5c2c544a42d78004d8aa40f768b38426f08f365090a75fdb5e4fad57a7cc'>webhook_server/tests/test_check_run_handler.py</a> <hr/> </details> </blockquote> <blockquote> <details> <summary><strong>test_repo_data_from_config.py</strong> <code>Test container context config parsing and defaults</code> <code>+27/-0</code></summary> <br/> >Test container context config parsing and defaults > ><pre> >• Asserts &#x27;container_context&#x27; defaults to empty string and is correctly read when supplied under the &#x27;container&#x27; config block. ></pre> > ><a href='https://github.com/myk-org/github-webhook-server/pull/1110/files#diff-4e76f4b4427b5be5ab45623aa0f35a24a51b526903e36dbf3ea347757ae1cc62'>webhook_server/tests/test_repo_data_from_config.py</a> <hr/> </details> </blockquote> <blockquote> <details> <summary><strong>test_runner_handler.py</strong> <code>Test build context selection and traversal rejection in container builds</code> <code>+96/-0</code></summary> <br/> >Test build context selection and traversal rejection in container builds > ><pre> >• Adds tests verifying the podman command uses the configured subdirectory as context, rejects traversal attempts that escape the worktree, and defaults to repo root when context is empty. ></pre> > ><a href='https://github.com/myk-org/github-webhook-server/pull/1110/files#diff-beb446b71d75b7a427d17e48e75971c270de4820cb9786a12af9e4f988887269'>webhook_server/tests/test_runner_handler.py</a> <hr/> </details> </blockquote> </details> <details> <summary><strong>Documentation</strong> (1)</summary> <blockquote> <details> <summary><strong>config.yaml</strong> <code>Document optional container build context setting</code> <code>+1/-0</code></summary> <br/> >Document optional container build context setting > ><pre> >• Adds an example/commented &#x27;context&#x27; key under &#x27;container&#x27; to demonstrate building from a subdirectory instead of repo root. ></pre> > ><a href='https://github.com/myk-org/github-webhook-server/pull/1110/files#diff-e3ac27ce128a0ddae53723bc8c3133257530a13c00f97c9edfeef152ebc3b8ce'>examples/config.yaml</a> <hr/> </details> </blockquote> </details> <details> <summary><strong>Other</strong> (1)</summary> <blockquote> <details> <summary><strong>schema.yaml</strong> <code>Add &#x27;container.context&#x27; to configuration schema with input constraints</code> <code>+8/-0</code></summary> <br/> >Add &#x27;container.context&#x27; to configuration schema with input constraints > ><pre> >• Introduces a &#x27;context&#x27; string option under &#x27;container&#x27; with a conservative allowed-character pattern and a default of empty string (repo root). ></pre> > ><a href='https://github.com/myk-org/github-webhook-server/pull/1110/files#diff-0eaf85d7f2a5888c61710a31c06434f63fe254f177f3df114332204780388f67'>webhook_server/config/schema.yaml</a> <hr/> </details> </blockquote> </details> <img src="https://www.qodo.ai/wp-content/uploads/2025/11/light-grey-line.svg" height="10%" alt="Grey Divider"> <a href="https://www.qodo.ai"><img src="https://www.qodo.ai/wp-content/uploads/2025/03/qodo-logo.svg" width="80" alt="Qodo Logo"></a>
1 parent 23028c4 commit d5390cc

8 files changed

Lines changed: 262 additions & 30 deletions

File tree

examples/config.yaml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -175,6 +175,7 @@ repositories:
175175
repository: <registry_repository_full_path>
176176
tag: <image_tag>
177177
release: true # Push image to registry on new release with release as the tag
178+
# context: src # Subdirectory to use as Docker build context (default: repo root)
178179
build-args: # build args to send to podman build command
179180
- my-build-arg1=1
180181
- my-build-arg2=2

webhook_server/config/schema.yaml

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -392,6 +392,14 @@ properties:
392392
items:
393393
type: string
394394
description: Additional arguments for build command
395+
context:
396+
type: string
397+
pattern: "^[a-zA-Z0-9._/-]*$"
398+
description: |
399+
Subdirectory to use as the Docker build context (relative to repo root).
400+
Default is empty string (repo root). Example: "src" builds from <repo>/src/.
401+
Only alphanumeric characters, dots, hyphens, underscores, and forward slashes allowed.
402+
default: ""
395403
oci-annotations:
396404
type: object
397405
description: OCI image annotations (https://github.com/opencontainers/image-spec/blob/main/annotations.md)

webhook_server/libs/github_api.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -818,6 +818,7 @@ def _repo_data_from_config(self, repository_config: dict[str, Any]) -> None:
818818
self.container_repository_password: str = self.build_and_push_container["password"]
819819
self.container_repository: str = self.build_and_push_container["repository"]
820820
self.dockerfile: str = self.build_and_push_container.get("dockerfile", "Dockerfile")
821+
self.container_context: str = self.build_and_push_container.get("context", "")
821822
self.container_tag: str = self.build_and_push_container.get("tag", "latest")
822823
_build_args = self.build_and_push_container.get("build-args", [])
823824
_cmd_args = self.build_and_push_container.get("args", [])

webhook_server/libs/handlers/check_run_handler.py

Lines changed: 44 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,5 @@
11
import asyncio
2+
import logging
23
from typing import TYPE_CHECKING, Any, NotRequired, TypedDict
34

45
from github.CheckRun import CheckRun
@@ -202,17 +203,57 @@ async def set_check_run_status(
202203
log_prefix=self.log_prefix,
203204
)
204205

206+
def _redact_output(self, text: str) -> str:
207+
"""Replace sensitive tokens, passwords, and credentials with *****."""
208+
_hased_str = "*****"
209+
210+
pypi_config = self.github_webhook.pypi
211+
if isinstance(pypi_config, dict):
212+
pypi_token = pypi_config.get("token")
213+
if pypi_token:
214+
text = text.replace(pypi_token, _hased_str)
215+
216+
if getattr(self.github_webhook, "container_repository_username", None):
217+
text = text.replace(self.github_webhook.container_repository_username, _hased_str)
218+
219+
if getattr(self.github_webhook, "container_repository_password", None):
220+
text = text.replace(self.github_webhook.container_repository_password, _hased_str)
221+
222+
if self.github_webhook.token:
223+
text = text.replace(self.github_webhook.token, _hased_str)
224+
225+
return text
226+
205227
def get_check_run_text(self, err: str, out: str) -> str:
206228
# Strip ANSI escape codes from output to prevent scrambled characters in GitHub UI
207229
err_clean = strip_ansi_codes(err)
208230
out_clean = strip_ansi_codes(out)
209231

232+
# Redact BEFORE truncation to prevent partial secret leaks at truncation boundaries
233+
err_clean = self._redact_output(err_clean)
234+
out_clean = self._redact_output(out_clean)
235+
210236
# GitHub limit is 65535 characters, but we use 65534 to be safe
211237
# We reserve space for the markdown wrapper: ```\n{err}\n\n{out}\n```
212238
# Wrapper overhead = len("```\n\n\n```") = 10 characters
213239
MAX_LEN = 65534
214240
WRAPPER_OVERHEAD = 10
215241

242+
# Check if truncation will be needed (using ANSI-stripped, redacted lengths)
243+
full_clean_len = len(err_clean) + len(out_clean) + WRAPPER_OVERHEAD
244+
will_truncate = full_clean_len > MAX_LEN
245+
246+
# Log the full redacted output BEFORE truncation so server logs preserve
247+
# the complete build output for debugging.
248+
# err_clean/out_clean are already redacted at this point.
249+
if will_truncate and self.logger.isEnabledFor(logging.DEBUG):
250+
full_output = f"```\n{err_clean}\n\n{out_clean}\n```"
251+
self.logger.debug(
252+
f"{self.log_prefix} Full check run output ({full_clean_len} chars, "
253+
f"will be truncated to {MAX_LEN} for GitHub). "
254+
f"Redacted full output:\n{full_output}"
255+
)
256+
216257
# Prepare error part first - we want to preserve it as much as possible
217258
# If error itself is huge, we might need to truncate it too, but usually it's small
218259
# If error + wrapper > MAX_LEN, we truncate error
@@ -241,24 +282,9 @@ def get_check_run_text(self, err: str, out: str) -> str:
241282

242283
_output = f"```\n{err_clean}\n\n{out_clean}\n```"
243284

244-
_hased_str = "*****"
245-
246-
pypi_config = self.github_webhook.pypi
247-
if isinstance(pypi_config, dict):
248-
pypi_token = pypi_config.get("token")
249-
if pypi_token:
250-
_output = _output.replace(pypi_token, _hased_str)
251-
252-
if getattr(self.github_webhook, "container_repository_username", None):
253-
_output = _output.replace(self.github_webhook.container_repository_username, _hased_str)
254-
255-
if getattr(self.github_webhook, "container_repository_password", None):
256-
_output = _output.replace(self.github_webhook.container_repository_password, _hased_str)
257-
258-
if self.github_webhook.token:
259-
_output = _output.replace(self.github_webhook.token, _hased_str)
260-
261-
return _output
285+
# Final redaction pass is idempotent but catches any secrets
286+
# introduced by the markdown wrapper itself (defensive)
287+
return self._redact_output(_output)
262288

263289
async def is_check_run_in_progress(self, check_run: str) -> bool:
264290
if self.github_webhook.last_commit:

webhook_server/libs/handlers/runner_handler.py

Lines changed: 32 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -395,11 +395,42 @@ async def run_build_container(
395395
is_merged=is_merged,
396396
tag_name=tag,
397397
) as (success, worktree_path, out, err):
398+
output: CheckRunOutput = {
399+
"title": "Build container",
400+
"summary": "",
401+
"text": None,
402+
}
403+
404+
if not success:
405+
output["text"] = self.check_run_handler.get_check_run_text(out=out, err=err)
406+
if pull_request and set_check:
407+
await self.check_run_handler.set_check_failure(name=BUILD_CONTAINER_STR, output=output)
408+
return
409+
398410
# Build container build command with worktree path
411+
# Use configured context subdirectory for build context (default: repo root)
412+
_context = self.github_webhook.container_context
413+
if _context:
414+
resolved_context = os.path.realpath(os.path.join(worktree_path, _context))
415+
resolved_worktree = os.path.realpath(worktree_path)
416+
is_under_worktree = resolved_context.startswith(resolved_worktree + os.sep)
417+
if not is_under_worktree and resolved_context != resolved_worktree:
418+
self.logger.error(
419+
f"{self.log_prefix} Container context '{_context}' resolves outside "
420+
f"worktree ({resolved_context}), rejecting for security"
421+
)
422+
output["text"] = f"Container build context '{_context}' escapes repository root"
423+
if pull_request and set_check:
424+
await self.check_run_handler.set_check_failure(name=BUILD_CONTAINER_STR, output=output)
425+
return
426+
build_context: str = resolved_context
427+
else:
428+
build_context = worktree_path
429+
399430
build_cmd: str = (
400431
f"--network=host {no_cache} -f "
401432
f"{worktree_path}/{self.github_webhook.dockerfile} "
402-
f"{worktree_path} -t {_container_repository_and_tag}"
433+
f"{build_context} -t {_container_repository_and_tag}"
403434
)
404435

405436
oci_annotation_flags = self._build_oci_annotations(pull_request=pull_request, tag=tag)
@@ -419,17 +450,6 @@ async def run_build_container(
419450
podman_build_cmd: str = f"podman build {build_cmd}"
420451
self.logger.debug(f"{self.log_prefix} Podman build command to run: {podman_build_cmd}")
421452

422-
output: CheckRunOutput = {
423-
"title": "Build container",
424-
"summary": "",
425-
"text": None,
426-
}
427-
if not success:
428-
output["text"] = self.check_run_handler.get_check_run_text(out=out, err=err)
429-
if pull_request and set_check:
430-
await self.check_run_handler.set_check_failure(name=BUILD_CONTAINER_STR, output=output)
431-
return
432-
433453
build_rc, build_out, build_err = await self.run_podman_command(
434454
command=podman_build_cmd,
435455
)

webhook_server/tests/test_check_run_handler.py

Lines changed: 53 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -455,6 +455,59 @@ def test_get_check_run_text_long_length(self, check_run_handler: CheckRunHandler
455455
# Verify the fix: it should end with the code block closer
456456
assert result.endswith("\n```")
457457

458+
def test_get_check_run_text_logs_full_redacted_output_before_truncation(
459+
self, check_run_handler: CheckRunHandler
460+
) -> None:
461+
"""Test that full redacted output is logged via debug before truncation.
462+
463+
The debug log must fire AFTER redaction to avoid leaking secrets to log
464+
aggregation systems, and BEFORE truncation so the complete build output
465+
is preserved in server logs.
466+
"""
467+
# Create text that exceeds 65534 characters to trigger truncation
468+
long_err = "E" * 10000
469+
long_out = "O" * 60000
470+
471+
check_run_handler.get_check_run_text(long_err, long_out)
472+
473+
# Verify debug was called with the full output
474+
debug_calls = [str(call) for call in check_run_handler.logger.debug.call_args_list]
475+
full_output_logged = any(
476+
"Full check run output" in call and "will be truncated" in call for call in debug_calls
477+
)
478+
assert full_output_logged, "Expected debug log with full output before truncation"
479+
480+
def test_get_check_run_text_full_log_has_secrets_redacted(self, check_run_handler: CheckRunHandler) -> None:
481+
"""Test that the full output debug log redacts secrets before logging."""
482+
# Build output that contains secrets and exceeds truncation threshold
483+
# Total (err_clean + out_clean + WRAPPER_OVERHEAD=10) must exceed MAX_LEN=65534
484+
long_out = "O" * 66000
485+
err_with_secret = "Error: auth failed with token test-token and user test-user" # pragma: allowlist secret
486+
487+
check_run_handler.get_check_run_text(err_with_secret, long_out)
488+
489+
# Verify the debug log does NOT contain the raw secrets
490+
debug_calls = [str(call) for call in check_run_handler.logger.debug.call_args_list]
491+
full_log_calls = [c for c in debug_calls if "Full check run output" in c]
492+
assert full_log_calls, "Expected full output debug log"
493+
for call in full_log_calls:
494+
assert "test-token" not in call, "Secret token leaked in debug log"
495+
assert "test-user" not in call, "Username leaked in debug log"
496+
assert "*****" in call, "Expected redacted placeholder in debug log"
497+
498+
def test_get_check_run_text_no_log_when_not_truncated(self, check_run_handler: CheckRunHandler) -> None:
499+
"""Test that no full output debug log is emitted when output fits within limit."""
500+
err = "Short error"
501+
out = "Short output"
502+
503+
check_run_handler.logger.debug.reset_mock()
504+
check_run_handler.get_check_run_text(err, out)
505+
506+
# Debug should NOT contain "Full check run output" for short text
507+
debug_calls = [str(call) for call in check_run_handler.logger.debug.call_args_list]
508+
full_output_logged = any("Full check run output" in call for call in debug_calls)
509+
assert not full_output_logged, "Should not log full output when no truncation needed"
510+
458511
def test_get_check_run_text_token_replacement(self, check_run_handler: CheckRunHandler) -> None:
459512
"""Test that sensitive tokens are replaced in check run text."""
460513
err = "Error with token: test-token"

webhook_server/tests/test_repo_data_from_config.py

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,7 @@ def test_repo_data_from_config_repository_found(process_github_webhook):
1919
assert process_github_webhook.container_repository_password == "registry_password" # pragma: allowlist secret
2020
assert process_github_webhook.container_repository == "registry_repository_full_path"
2121
assert process_github_webhook.dockerfile == "Dockerfile"
22+
assert process_github_webhook.container_context == ""
2223
assert process_github_webhook.container_tag == "image_tag"
2324
assert process_github_webhook.container_build_args == ["my-build-arg1=1", "my-build-arg2=2"]
2425
assert process_github_webhook.container_command_args == ["--format docker"]
@@ -29,6 +30,32 @@ def test_repo_data_from_config_repository_found(process_github_webhook):
2930
assert process_github_webhook.minimum_lgtm == 0
3031

3132

33+
def test_container_context_from_config(process_github_webhook) -> None:
34+
"""Test that container context is read from repository config."""
35+
original_get_value = process_github_webhook.config.get_value
36+
37+
def patched_get_value(value: str, *args: Any, **kwargs: Any) -> Any:
38+
if value == "container":
39+
return {
40+
"username": "user",
41+
"password": "pass", # pragma: allowlist secret
42+
"repository": "registry/repo",
43+
"context": "src/app",
44+
}
45+
return original_get_value(value, *args, **kwargs)
46+
47+
with patch.object(process_github_webhook.config, "get_value", side_effect=patched_get_value):
48+
process_github_webhook._repo_data_from_config(repository_config={})
49+
50+
assert process_github_webhook.container_context == "src/app"
51+
52+
53+
def test_container_context_defaults_to_empty_string(process_github_webhook) -> None:
54+
"""Test that container context defaults to empty string when not set."""
55+
process_github_webhook._repo_data_from_config(repository_config={})
56+
assert process_github_webhook.container_context == ""
57+
58+
3259
def test_tox_python_version_nested_no_deprecation_warning(process_github_webhook, caplog):
3360
"""When 'python-version' is set under 'tox', no deprecation warning should be logged."""
3461
with caplog.at_level(python_logging.WARNING):

0 commit comments

Comments
 (0)