Skip to content

Commit 1d68943

Browse files
authored
feat: add trusted-committers allowlist for security-committer-identity check (#1117)
## Problem Bot cherry-picks and pre-commit CI auto-fixes trigger false committer identity mismatch failures. For example, `pre-commit-ci[bot]` pushing a formatting fix to a PR authored by `legit-user` fails the `security-committer-identity` check, even though this is expected behavior. ## Solution Adds a `trusted-committers` config option under `security-checks` that allowlists committer logins for the identity check. When the last committer is in the list, the check passes with a success message instead of failing. ### Features - Case-insensitive comparison (config entries normalized to lowercase) - Element sanitization (strips whitespace, filters invalid types) - Warning log when config value is wrong type - Security ordering: "unknown" committer check always runs first, even if "unknown" is in the trusted list - Works at both global and per-repository level ### Config example ```yaml security-checks: committer-identity-check: true trusted-committers: - "pre-commit-ci[bot]" - "myorg" ``` ## Changes - `webhook_server/config/schema.yaml` — added `trusted-committers` array to `security-checks` - `webhook_server/libs/github_api.py` — reads and sanitizes `trusted-committers` from config - `webhook_server/libs/handlers/runner_handler.py` — checks allowlist before failing on mismatch - `webhook_server/tests/test_security_checks.py` — 2 new tests (trusted pass + untrusted fail), updated all fixtures - `examples/config.yaml` — added example Closes #1116 Assisted-by: Claude <noreply@anthropic.com> --------- Signed-off-by: rnetser <rnetser@redhat.com> Signed-off-by: Ruth Netser <rnetser@redhat.com>
1 parent 993f795 commit 1d68943

10 files changed

Lines changed: 256 additions & 109 deletions

File tree

examples/config.yaml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -150,6 +150,8 @@ security-checks:
150150
- ".github/workflows/"
151151
- ".github/actions/"
152152
committer-identity-check: true
153+
trusted-committers: # Additional trusted committers (GitHub App bot, web-flow, and API users are auto-included)
154+
- "pre-commit-ci[bot]"
153155

154156
repositories:
155157
my-repository:

webhook_server/config/schema.yaml

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -95,6 +95,16 @@ $defs:
9595
When enabled, compares the PR author (parent committer) against the last commit's
9696
committer. Fails the security-committer-identity check run if they differ.
9797
default: true
98+
trusted-committers:
99+
type: array
100+
description: |
101+
List of committer logins that are always trusted for the committer-identity check.
102+
When the last committer matches an entry in this list, the check passes regardless
103+
of the PR author. Useful for bots (e.g., pre-commit-ci[bot]) and org identities.
104+
Note: The GitHub App bot, web-flow, and github-tokens API users are
105+
automatically included — you only need to list additional external committers here.
106+
items:
107+
type: string
98108
additionalProperties: false
99109
type: object
100110
properties:

webhook_server/docs/pr-flow-api.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -243,12 +243,14 @@ The PR Flow API integrates seamlessly with the log viewer system to provide:
243243
```python
244244
import httpx
245245

246+
246247
async def analyze_pr_workflow(hook_id: str, base_url: str = "http://192.168.10.44:5003") -> dict:
247248
async with httpx.AsyncClient() as client:
248249
# Example using production webhook server
249250
response = await client.get(f"{base_url}/logs/api/pr-flow/{hook_id}")
250251
return response.json()
251252

253+
252254
# Usage
253255
workflow_data = await analyze_pr_workflow("f4b3c2d1-a9b8-4c5d-9e8f-1a2b3c4d5e6f")
254256
print(f"Workflow health: {workflow_data['performance_metrics']['workflow_health']}")

webhook_server/libs/github_api.py

Lines changed: 62 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -36,6 +36,7 @@
3636
CONFIGURABLE_LABEL_CATEGORIES,
3737
CONVENTIONAL_TITLE_STR,
3838
DEFAULT_SUSPICIOUS_PATHS,
39+
GITHUB_WEB_FLOW_LOGIN,
3940
OTHER_MAIN_BRANCH,
4041
PRE_COMMIT_STR,
4142
PYTHON_MODULE_INSTALL_STR,
@@ -504,7 +505,7 @@ async def _recheck_merge_eligibility(self, pull_request: PullRequest) -> None:
504505

505506
async def process(self) -> Any:
506507
# Early exit for pull_request_review_thread events that don't need processing.
507-
# Must run BEFORE add_api_users_to_auto_verified_and_merged_users() to avoid
508+
# Must run BEFORE get_api_users() to avoid
508509
# burning rate limit on get_user() calls for skipped events.
509510
if self.github_event == "pull_request_review_thread":
510511
action = self.hook_data["action"]
@@ -526,7 +527,7 @@ async def process(self) -> Any:
526527

527528
# Early exit for status events with pending state — only terminal states
528529
# (success, failure, error) need processing.
529-
# Must run BEFORE add_api_users_to_auto_verified_and_merged_users() to avoid
530+
# Must run BEFORE get_api_users() to avoid
530531
# burning rate limit on get_user() calls for skipped events.
531532
if self.github_event == "status":
532533
state = self.hook_data["state"]
@@ -542,7 +543,8 @@ async def process(self) -> Any:
542543
return None
543544

544545
# Initialize auto-verified users from API users (async operation)
545-
await self.add_api_users_to_auto_verified_and_merged_users()
546+
api_users = await self.get_api_users()
547+
self.auto_verified_and_merged_users.extend(user for user in api_users if user is not None)
546548

547549
# Initialize app bot login for bot-PR identification (async)
548550
if not self.app_bot_login:
@@ -569,6 +571,9 @@ async def process(self) -> Any:
569571
else:
570572
self.logger.debug(f"{self.log_prefix} No GitHub App API available — app_bot_login not set")
571573

574+
# Build final trusted-committers list with dynamic identities
575+
await self._build_trusted_committers(api_users=api_users)
576+
572577
event_log: str = f"Event type: {self.github_event}. event ID: {self.x_github_delivery}"
573578

574579
# Start webhook routing context step
@@ -668,7 +673,7 @@ async def process(self) -> Any:
668673
self.last_commit = await self._get_last_commit(pull_request=pull_request)
669674
self.parent_committer = pull_request.user.login
670675
self.last_committer = await github_api_call(
671-
lambda: getattr(self.last_commit.committer, "login", "unknown"),
676+
lambda: getattr(self.last_commit.committer, "login", "unknown") or "unknown",
672677
logger=self.logger,
673678
log_prefix=self.log_prefix,
674679
)
@@ -835,27 +840,18 @@ async def process(self) -> Any:
835840
await self._update_context_metrics()
836841
return None
837842

838-
async def add_api_users_to_auto_verified_and_merged_users(self) -> None:
843+
async def get_api_users(self) -> list[str | None]:
839844
apis_and_tokens = get_apis_and_tokes_from_config(config=self.config)
840845

841846
async def check_token(api: github.Github, token: str) -> str | None:
842847
"""Check a single API token and return the user login if valid, None otherwise."""
843848
token_suffix = f"...{token[-4:]}" if token else "unknown"
844849
try:
845-
rate_limit_remaining = await github_api_call(
846-
lambda: api.rate_limiting[-1], logger=self.logger, log_prefix=self.log_prefix
847-
)
848-
except Exception as ex:
849-
self.logger.warning(
850-
f"{self.log_prefix} Failed to get API rate limit for token ending in '{token_suffix}', "
851-
f"skipping. {ex}"
852-
)
853-
return None
854-
855-
if rate_limit_remaining == 60:
856-
self.logger.warning(
857-
f"{self.log_prefix} API has rate limit set to 60 which indicates an invalid token "
858-
f"(token ending in '{token_suffix}'), skipping"
850+
# Pre-flight probe: verify token is functional before attempting get_user()
851+
await github_api_call(lambda: api.rate_limiting[-1], logger=self.logger, log_prefix=self.log_prefix)
852+
except Exception:
853+
self.logger.exception(
854+
f"{self.log_prefix} Failed to get API rate limit for token ending in '{token_suffix}', skipping"
859855
)
860856
return None
861857

@@ -871,8 +867,7 @@ async def check_token(api: github.Github, token: str) -> str | None:
871867

872868
return _api_user
873869

874-
results = await asyncio.gather(*[check_token(api, token) for api, token in apis_and_tokens])
875-
self.auto_verified_and_merged_users.extend(user for user in results if user is not None)
870+
return await asyncio.gather(*[check_token(api, token) for api, token in apis_and_tokens])
876871

877872
def prepare_log_prefix(self, pull_request: PullRequest | None = None) -> str:
878873
return prepare_log_prefix(
@@ -997,6 +992,19 @@ def _repo_data_from_config(self, repository_config: dict[str, Any]) -> None:
997992
_mandatory_raw = True
998993
self.security_mandatory: bool = _mandatory_raw
999994

995+
_trusted_committers = _security_config.get("trusted-committers", [])
996+
if not isinstance(_trusted_committers, list):
997+
self.logger.warning(
998+
f"{self.log_prefix} security-checks.trusted-committers must be array, "
999+
f"got {type(_trusted_committers).__name__}. Defaulting to empty list."
1000+
)
1001+
_trusted_committers = []
1002+
self.security_trusted_committers: list[str] = [
1003+
str(entry).strip().lower()
1004+
for entry in _trusted_committers
1005+
if isinstance(entry, (str, int, float)) and not isinstance(entry, bool) and str(entry).strip()
1006+
]
1007+
10001008
_auto_merge_prs = self.config.get_value(
10011009
value="set-auto-merge-prs", return_on_none=[], extra_dict=repository_config
10021010
)
@@ -1110,6 +1118,39 @@ def _sanitize_item(item: Any) -> str:
11101118

11111119
self.mask_sensitive = self.config.get_value("mask-sensitive-data", return_on_none=True)
11121120

1121+
async def _build_trusted_committers(self, api_users: list[str | None] | None = None) -> None:
1122+
"""Add dynamic entries to trusted-committers list.
1123+
1124+
Adds:
1125+
- GitHub App bot login (app_bot_login)
1126+
- GitHub's web-flow system account
1127+
- API user logins from github-tokens
1128+
1129+
Static config entries are already loaded in _repo_data_from_config().
1130+
Called once from process() after app_bot_login is initialized.
1131+
"""
1132+
# GitHub App bot
1133+
if self.app_bot_login:
1134+
bot = str(self.app_bot_login).strip().lower()
1135+
if bot and bot not in self.security_trusted_committers:
1136+
self.security_trusted_committers.append(bot)
1137+
1138+
# GitHub's web-flow system account
1139+
web_flow = GITHUB_WEB_FLOW_LOGIN.lower()
1140+
if web_flow not in self.security_trusted_committers:
1141+
self.security_trusted_committers.append(web_flow)
1142+
1143+
# API users from github-tokens
1144+
if api_users is None:
1145+
api_users = await self.get_api_users()
1146+
for user in api_users:
1147+
if user:
1148+
normalized = str(user).strip().lower()
1149+
if normalized and normalized not in self.security_trusted_committers:
1150+
self.security_trusted_committers.append(normalized)
1151+
1152+
self.logger.debug(f"{self.log_prefix} Trusted committers: {self.security_trusted_committers}")
1153+
11131154
async def get_pull_request(self, number: int | None = None) -> PullRequest | None:
11141155
if number:
11151156
self.logger.debug(f"{self.log_prefix} Attempting to get PR by number: {number}")

webhook_server/libs/handlers/runner_handler.py

Lines changed: 60 additions & 48 deletions
Original file line numberDiff line numberDiff line change
@@ -372,8 +372,8 @@ async def run_security_suspicious_paths(self) -> None:
372372
async def run_security_committer_identity(self) -> None:
373373
"""Check if the last committer matches the PR author.
374374
375-
Fails the check run if the last commit's committer differs from the PR author
376-
(parent_committer), which may indicate a commit was authored by someone unexpected.
375+
Uses a unified trusted-committers list that includes static config entries,
376+
the GitHub App bot login, web-flow, and API users from github-tokens.
377377
"""
378378
if not self.github_webhook.security_committer_identity_check:
379379
self.logger.debug(f"{self.log_prefix} Committer identity check disabled, skipping")
@@ -385,15 +385,17 @@ async def run_security_committer_identity(self) -> None:
385385
parent_committer = self.github_webhook.parent_committer
386386
last_committer = self.github_webhook.last_committer
387387

388+
# SECURITY: "unknown" check MUST precede the trusted-committers check.
389+
# An unverifiable committer identity should always fail.
388390
if last_committer == "unknown":
389391
output: CheckRunOutput = {
390-
"title": "\u274c Security: Committer Identity Unknown",
392+
"title": " Security: Committer Identity Unknown",
391393
"summary": "Committer identity could not be verified",
392394
"text": (
393395
"## Committer Identity Check\n\n"
394396
f"**PR author:** `{parent_committer}`\n"
395397
"**Last commit committer:** unknown\n\n"
396-
"Committer identity could not be verified \u2014 "
398+
"Committer identity could not be verified "
397399
"last commit has no associated GitHub user.\n\n"
398400
"This may indicate:\n"
399401
"- A commit was made with a local Git identity not linked to a GitHub account\n"
@@ -406,68 +408,78 @@ async def run_security_committer_identity(self) -> None:
406408
f"PR author={parent_committer}, last committer has no GitHub user"
407409
)
408410
await self.check_run_handler.set_check_failure(name=SECURITY_COMMITTER_IDENTITY_STR, output=output)
409-
elif last_committer == GITHUB_WEB_FLOW_LOGIN:
410-
last_committer_id = self.github_webhook.last_committer_id
411-
if last_committer_id == GITHUB_WEB_FLOW_USER_ID:
412-
self.logger.debug(
413-
f"{self.log_prefix} Last committer is GitHub's web-flow "
414-
f"(user ID {last_committer_id}) — passing committer identity check"
411+
412+
elif last_committer.lower() != parent_committer.lower():
413+
if last_committer.lower() in self.github_webhook.security_trusted_committers:
414+
# Extra guard: verify web-flow by immutable user ID to prevent impersonation
415+
if last_committer.lower() == GITHUB_WEB_FLOW_LOGIN:
416+
last_committer_id = self.github_webhook.last_committer_id
417+
if last_committer_id != GITHUB_WEB_FLOW_USER_ID:
418+
self.logger.warning(
419+
f"{self.log_prefix} Committer login is 'web-flow' but user ID "
420+
f"{last_committer_id} does not match GitHub's web-flow ID "
421+
f"{GITHUB_WEB_FLOW_USER_ID} — possible impersonation"
422+
)
423+
output = {
424+
"title": "❌ Security: Committer Identity Suspicious",
425+
"summary": (
426+
f"Committer claims to be web-flow but has unexpected user ID {last_committer_id}"
427+
),
428+
"text": (
429+
f"## Committer Identity Check\n\n"
430+
f"**PR author:** `{parent_committer}`\n"
431+
f"**Last commit committer:** `{last_committer}` (ID: {last_committer_id})\n"
432+
f"**Expected web-flow ID:** {GITHUB_WEB_FLOW_USER_ID}\n\n"
433+
f"The committer login is `web-flow` but the user ID does not match "
434+
f"GitHub's official web-flow account. This may indicate an impersonation attempt."
435+
),
436+
}
437+
await self.check_run_handler.set_check_failure(
438+
name=SECURITY_COMMITTER_IDENTITY_STR, output=output
439+
)
440+
return
441+
442+
# Trusted committer — pass
443+
self.logger.info(
444+
f"{self.log_prefix} Committer identity: '{last_committer}' is in unified trusted list"
415445
)
416446
output = {
417447
"title": "Security: Committer Identity",
418-
"summary": "Committer identity verified (GitHub web-flow)",
448+
"summary": f"Committer '{last_committer}' is trusted",
419449
"text": (
420450
f"## Committer Identity Check\n\n"
421451
f"**PR author:** `{parent_committer}`\n"
422-
f"**Last commit committer:** `{last_committer}` (ID: {last_committer_id})\n\n"
423-
f"The last commit was made via the GitHub web UI (rebase, merge, or edit). "
424-
f"The committer is GitHub's verified `web-flow` system account "
425-
f"(user ID {GITHUB_WEB_FLOW_USER_ID}), confirming this is a legitimate "
426-
f"GitHub web operation."
452+
f"**Last commit committer:** `{last_committer}`\n\n"
453+
f"The committer differs from the PR author but is in the trusted committers list.\n"
454+
f"This is expected for automated workflows (bots, CI tools, org identities, "
455+
f"GitHub web operations)."
427456
),
428457
}
429458
await self.check_run_handler.set_check_success(name=SECURITY_COMMITTER_IDENTITY_STR, output=output)
430459
else:
431-
self.logger.warning(
432-
f"{self.log_prefix} Committer login is 'web-flow' but user ID "
433-
f"{last_committer_id} does not match GitHub's web-flow ID "
434-
f"{GITHUB_WEB_FLOW_USER_ID} — possible impersonation"
435-
)
460+
# Untrusted mismatch — fail
436461
output = {
437-
"title": "\u274c Security: Committer Identity Suspicious",
438-
"summary": f"Committer claims to be web-flow but has unexpected user ID {last_committer_id}",
462+
"title": " Security: Committer Identity Mismatch",
463+
"summary": f"Last committer '{last_committer}' differs from PR author '{parent_committer}'",
439464
"text": (
440465
f"## Committer Identity Check\n\n"
441466
f"**PR author:** `{parent_committer}`\n"
442-
f"**Last commit committer:** `{last_committer}` (ID: {last_committer_id})\n"
443-
f"**Expected web-flow ID:** {GITHUB_WEB_FLOW_USER_ID}\n\n"
444-
f"The committer login is `web-flow` but the user ID does not match "
445-
f"GitHub's official web-flow account. This may indicate an impersonation attempt."
467+
f"**Last commit committer:** `{last_committer}`\n\n"
468+
f"The last commit in this PR was made by a different user than the PR author. "
469+
f"This may indicate:\n"
470+
f"- An unauthorized commit was pushed to the PR branch\n"
471+
f"- A bot or automation tool committed with unexpected credentials\n"
472+
f"- A legitimate co-author contribution (review carefully)\n\n"
473+
f"Please verify this is expected before merging."
446474
),
447475
}
476+
self.logger.warning(
477+
f"{self.log_prefix} Committer identity mismatch: "
478+
f"PR author={parent_committer}, last committer={last_committer}"
479+
)
448480
await self.check_run_handler.set_check_failure(name=SECURITY_COMMITTER_IDENTITY_STR, output=output)
449-
elif last_committer != parent_committer:
450-
output = {
451-
"title": "\u274c Security: Committer Identity Mismatch",
452-
"summary": f"Last committer '{last_committer}' differs from PR author '{parent_committer}'",
453-
"text": (
454-
f"## Committer Identity Check\n\n"
455-
f"**PR author:** `{parent_committer}`\n"
456-
f"**Last commit committer:** `{last_committer}`\n\n"
457-
f"The last commit in this PR was made by a different user than the PR author. "
458-
f"This may indicate:\n"
459-
f"- An unauthorized commit was pushed to the PR branch\n"
460-
f"- A bot or automation tool committed with unexpected credentials\n"
461-
f"- A legitimate co-author contribution (review carefully)\n\n"
462-
f"Please verify this is expected before merging."
463-
),
464-
}
465-
self.logger.warning(
466-
f"{self.log_prefix} Committer identity mismatch: "
467-
f"PR author={parent_committer}, last committer={last_committer}"
468-
)
469-
await self.check_run_handler.set_check_failure(name=SECURITY_COMMITTER_IDENTITY_STR, output=output)
470481
else:
482+
# Match — pass
471483
output = {
472484
"title": "Security: Committer Identity",
473485
"summary": "Committer identity verified",

0 commit comments

Comments
 (0)