feat: implement session-based tracking for malicious behavior by Jazzcort · Pull Request #430 · rhel-lightspeed/linux-mcp-server

Jazzcort · 2026-04-23T18:58:36Z

Adds BehaviorRecord and BehaviorRecordManager to monitor gatekeeper verdicts per session. If a session accumulates 10 total or 3 consecutive malicious actions, it is permanently flagged as malicious. Once flagged, all subsequent scripts executed in that session will be forced to require user confirmation, regardless of the script's default requirements.

github-actions · 2026-04-23T18:58:44Z

For team members: test commit 5439742 in internal GitLab

codecov · 2026-04-23T19:01:58Z

Codecov Report

❌ Patch coverage is 84.90566% with 8 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/linux_mcp_server/tools/run_script.py | 84.90% | 4 Missing and 4 partials ⚠️ |

| Flag | Coverage Δ | |
| --- | --- | --- |
| unittests | `97.10% <84.90%> (-0.10%)` | ⬇️ |


| Files with missing lines | Coverage Δ | |
| --- | --- | --- |
| src/linux_mcp_server/tools/run_script.py | `93.72% <84.90%> (-2.60%)` | ⬇️ |

... and 4 files with indirect coverage changes


owtaylor

Hmm, not sure this is going to be effective:

A malicious model can easily evade a "consecutive malicious scripts" check. Also, a malicious script for readonly that modifies the system or installs things from the internet is currently often tagged as MODIFIES_SYSTEM or POLICY, so a malicious
10 tries is a lot; by then the game might be up.

Maybe:

Any MALICIOUS script triggers confirmation of the next run_script (but not the whole session)
For the mcp-apps case, we display a boxed warning in that confirmation:

Suspicious activity detected - your chat client may be under attack. Please examine previous tool calls in detail, and if you have any doubts, do not approve this command and terminate this chat session.

There's not much we can do about the non-mcp-apps case.

owtaylor · 2026-05-18T21:24:26Z

+    A session is flagged as malicious if it accumulates 10+ total malicious actions
+    or MAX_CONSECUTIVE_MALICIOUS_ACTIONS consecutive malicious actions.


A little odd to have one be a constant and the other not.

Got it! I'll have a MAX_TOTAL_MALICIOUS_ACTIONS for that. What would be the suitable count for it? 4 maybe? 🤔

owtaylor · 2026-05-19T00:44:05Z

+    """
+
+    def __init__(self):
+        self._recent_record: deque[GatekeeperStatus] = deque(maxlen=MAX_CONSECUTIVE_MALICIOUS_ACTIONS)


deque seems unnecessary - can you just have a _consecutive_malicious_action_count and reset it to zero if you get some other status?

You're right! It can be simplified 😁
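A minimal sketch of the counter-based alternative suggested above, assuming a `GatekeeperStatus` enum with at least `OK` and `MALICIOUS` members. The enum values, the `ConsecutiveCounter` class, and its method names are hypothetical and are not taken from the PR; the threshold of 3 comes from the PR description.

```python
from enum import Enum


class GatekeeperStatus(Enum):
    # Assumed members; the real enum in linux-mcp-server may differ.
    OK = "ok"
    MALICIOUS = "malicious"


# Threshold taken from the PR description (3 consecutive malicious actions).
MAX_CONSECUTIVE_MALICIOUS_ACTIONS = 3


class ConsecutiveCounter:
    """Hypothetical helper: tracks the current run of MALICIOUS verdicts."""

    def __init__(self) -> None:
        self._consecutive_malicious_count = 0

    def record(self, status: GatekeeperStatus) -> None:
        # Any non-MALICIOUS verdict breaks the run and resets the counter.
        if status == GatekeeperStatus.MALICIOUS:
            self._consecutive_malicious_count += 1
        else:
            self._consecutive_malicious_count = 0

    @property
    def threshold_reached(self) -> bool:
        return self._consecutive_malicious_count >= MAX_CONSECUTIVE_MALICIOUS_ACTIONS
```

A single integer carries the same information as a bounded deque here, because only the length of the current run matters. Note that this sketch does not latch: a later non-malicious verdict resets the counter, so a permanent flag would need to be stored separately.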

github-actions · 2026-05-21T19:29:02Z

For team members: test commit 0951ca0 in internal GitLab

owtaylor · 2026-05-21T19:56:56Z

+
+    @property
+    def is_previous_action_malicious(self) -> bool:
+        return self._previous_action_status == GatekeeperStatus.MALICIOUS


I think this should not be cleared except by an approval - that is:

validate_script => MALICIOUS - sets flag
validate_script => OK => flag stays set
execute_script => flag is cleared

Make sense! Will change!
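The flag lifecycle requested in this thread could be sketched roughly as follows. The `PendingWarning` class and its method names are hypothetical, and the `GatekeeperStatus` members are assumed; only the behavior (set on MALICIOUS, unchanged by later verdicts, cleared on an approved execution) follows the review comment.

```python
from enum import Enum


class GatekeeperStatus(Enum):
    # Assumed members; the real enum in linux-mcp-server may differ.
    OK = "ok"
    MALICIOUS = "malicious"


class PendingWarning:
    """Hypothetical helper: a warning that only a human approval can clear."""

    def __init__(self) -> None:
        self._warning_pending = False

    def on_validate(self, status: GatekeeperStatus) -> None:
        # A MALICIOUS verdict sets the flag; any other verdict leaves it alone.
        if status == GatekeeperStatus.MALICIOUS:
            self._warning_pending = True

    def on_execute_approved(self) -> None:
        # Called once the user has confirmed and the script actually runs.
        self._warning_pending = False

    @property
    def requires_confirmation(self) -> bool:
        return self._warning_pending
```

The point of the design is that a model cannot clear the warning itself by submitting a harmless script right after a malicious one.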

owtaylor · 2026-05-21T19:59:19Z

+
+    @property
+    def is_security_compromised(self) -> bool:
+        return self._is_security_compromised


If the model is security compromised, then we have no business allowing the user to execute anything.

What if we unified the two flags into "malicious_activity_warning", which gets set transiently on a single MALICIOUS and permanently once past the thresholds?

This differs from your setup in that we'd display the scary warning for a single malicious activity detection, but I think that's probably appropriate. An attacker only has to get through once.

Sounds great! I'm always bad at naming these. Will change!

github-actions · 2026-05-25T15:33:29Z

For team members: test commit cf62e35 in internal GitLab

Add BehaviorRecord and BehaviorRecordManager to track gatekeeper verdicts per session. A single MALICIOUS verdict sets a temporary warning that forces human confirmation on the next script; the warning clears once the human approves. If a session accumulates 4 total or 3 consecutive malicious verdicts, it is permanently flagged and all subsequent scripts require confirmation.

Integrate behavior tracking into validate_script, run_script, run_script_interactive, run_script_with_confirmation, and execute_script. Emit malicious_activity_warning through the RunScriptInteractiveResult for the mcp-app UI.

Add documentation for session-based behavior tracking in guarded-command-execution.md.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
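As a rough illustration of the behavior this commit message describes, the sketch below combines the transient and permanent warnings in one record per session. The class names, constant names, and `malicious_activity_warning` come from the PR; the method names (`record_verdict`, `record_approval`, `get`), the enum values, and the use of a string session ID are assumptions, and the actual code in `run_script.py` may be organized differently.

```python
from enum import Enum


class GatekeeperStatus(Enum):
    # Assumed members; the real enum in linux-mcp-server may differ.
    OK = "ok"
    MALICIOUS = "malicious"


# Thresholds as stated in the commit message.
MAX_TOTAL_MALICIOUS_ACTIONS = 4
MAX_CONSECUTIVE_MALICIOUS_ACTIONS = 3


class BehaviorRecord:
    """Sketch of per-session tracking; method names are hypothetical."""

    def __init__(self) -> None:
        self._total_malicious = 0
        self._consecutive_malicious = 0
        self._transient_warning = False
        self._permanent_warning = False

    def record_verdict(self, status: GatekeeperStatus) -> None:
        if status == GatekeeperStatus.MALICIOUS:
            self._total_malicious += 1
            self._consecutive_malicious += 1
            self._transient_warning = True
            if (
                self._total_malicious >= MAX_TOTAL_MALICIOUS_ACTIONS
                or self._consecutive_malicious >= MAX_CONSECUTIVE_MALICIOUS_ACTIONS
            ):
                # Latches: nothing below ever resets this flag.
                self._permanent_warning = True
        else:
            self._consecutive_malicious = 0

    def record_approval(self) -> None:
        # A human approval clears only the transient warning.
        self._transient_warning = False

    @property
    def malicious_activity_warning(self) -> bool:
        return self._transient_warning or self._permanent_warning


class BehaviorRecordManager:
    """Sketch: one BehaviorRecord per session, keyed by an assumed string ID."""

    def __init__(self) -> None:
        self._records: dict[str, BehaviorRecord] = {}

    def get(self, session_id: str) -> BehaviorRecord:
        if session_id not in self._records:
            self._records[session_id] = BehaviorRecord()
        return self._records[session_id]
```

In this sketch a caller would check `malicious_activity_warning` before running a script and force confirmation whenever it is true, regardless of the script's default requirements.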

github-actions · 2026-05-26T13:24:10Z

For team members: test commit bf4a203 in internal GitLab

owtaylor requested changes May 19, 2026


Jazzcort force-pushed the hardening-security-for-malicious-attemps branch from 5439742 to 0951ca0 Compare May 21, 2026 19:28

Jazzcort marked this pull request as ready for review May 21, 2026 19:47

Jazzcort requested a review from a team as a code owner May 21, 2026 19:47

owtaylor requested changes May 21, 2026


Jazzcort force-pushed the hardening-security-for-malicious-attemps branch from 0951ca0 to cf62e35 Compare May 25, 2026 15:33

Jazzcort force-pushed the hardening-security-for-malicious-attemps branch from cf62e35 to bf4a203 Compare May 26, 2026 13:23

Jazzcort requested a review from owtaylor May 26, 2026 13:31
