Commit 1b82054

Sync from github/github-well-architected-internal (main)
Source Repository: github/github-well-architected-internal
Source Branch: main
Source SHA: 8e66a953725eb52fe03032758f48d4a88b2658e5
1 parent 152d298 commit 1b82054

3 files changed

Lines changed: 353 additions & 80 deletions

archetypes/default.md

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 ---
-draft: true # Set to false when ready to publish
+draft: false # Set to true to keep the page hidden
 title: 'Insert title here'
 publishDate: YYYY-MM-DD
 params:
Lines changed: 274 additions & 0 deletions
@@ -0,0 +1,274 @@
1+
---
2+
# SPDX-FileCopyrightText: GitHub and The Project Authors
3+
# SPDX-License-Identifier: MIT
4+
5+
draft: false # Set to false when ready to publish
6+
7+
title: 'Governing agentic workflows with gh-aw and APM'
8+
publishDate: 2026-06-01
9+
params:
10+
authors:
11+
[
12+
{ name: 'Alex De Michieli', handle: 'alexdemichieli' },
13+
]
14+
15+
pillars:
16+
- governance
17+
18+
areas:
19+
- agent-and-extensions
20+
- security
21+
22+
verticals:
23+
- finance
24+
- government
25+
- information-technology
26+
- professional-service
27+
28+
personas:
29+
- administrator
30+
- developer
31+
32+
platform:
33+
- github-enterprise-cloud
34+
- github-enterprise-cloud-plus-emu
35+
36+
features:
37+
- copilot
38+
- github-actions
39+
40+
components:
41+
- coding-agents
42+
- review-agents
43+
- copilot-skill
44+
- governance-and-policy
45+
- push-rulesets
46+
47+
github:
48+
- enterprise-support
49+
- customer-success-architect
50+
- expert-services
51+
---
52+
53+
<!-- markdownlint-disable MD025 -->
54+
<!-- markdownlint-disable MD013 -->
55+
56+
## Scenario overview
57+
58+
Regulated enterprises want AI agents to perform useful work inside their repositories (code review, security scanning, documentation generation) but cannot adopt them without governance guarantees that match what they already require for code dependencies: pinned versions, supply chain scanning, allowlists, sandboxed execution, and auditable runs.
59+
60+
[GitHub Agentic Workflows (gh-aw)](https://github.github.com/gh-aw/) is a framework that lets teams define AI agent automation as markdown files and compile them into deterministic GitHub Actions workflows. The agent runs inside a sandboxed container, interacts with the repository through tools, and produces outputs that are validated before any write action occurs.
61+
62+
The existing [Governing agents in GitHub Enterprise]({{< relref "governing-agents.md" >}}) article covers enterprise Copilot policies, model restrictions, audit logging, and MCP governance. This article covers a complementary layer: how **gh-aw** provides a sandboxed runtime that separates the LLM from write permissions, and how the **[Agent Package Manager (APM)](https://microsoft.github.io/apm/)** adds dependency governance for agent skills and prompts.
63+
64+
Together, these tools address three adoption blockers in regulated environments:
65+
66+
1. **The prompt/skill supply chain is uncontrolled.** Skills are markdown files in arbitrary repos, with no versioning, no SHA pinning, no allow/deny lists, and no scanning for prompt-injection payloads hidden in Unicode.
67+
2. **The agent is an untrusted execution environment.** The LLM consumes attacker-controlled inputs: PR diffs, issue bodies, fetched URLs, and commit messages. Any of these can contain prompt injection payloads that the model may interpret as instructions. If the agent holds write tokens, a successful injection means pushed code or exfiltrated secrets.
68+
3. **No audit or reproducibility story.** Compliance asks "show me exactly what ran, with what permissions, against what input, on what date." Ad-hoc scripts calling the Copilot API typically cannot provide immutable run logs, input and version snapshots, permission context, or execution identity and timestamps in a reviewable format.
69+
70+
The core design principle behind gh-aw is capability separation: **the component that can be tricked (the LLM) holds no write permissions, and the component that can write is deterministic and only acts on validated output.** Skills are pinned to exact SHAs and restricted by org-level policy, and the entire run is a standard Actions workflow with full audit log coverage.
71+
72+
To understand how that separation is enforced at runtime, let's look at the pipeline that gh-aw compiles.
73+
74+
## gh-aw runtime architecture
75+
76+
gh-aw compiles a markdown prompt file into a deterministic GitHub Actions workflow (`.lock.yml`) with a 5-job pipeline that enforces capability separation:
77+
78+
```mermaid
79+
flowchart TD
80+
A[activation] --> B[agent]
81+
B --> C[detection]
82+
C --> D[safe_outputs]
83+
D --> E[conclusion]
84+
85+
style A fill:#e8f4fd,stroke:#0969da
86+
style B fill:#fff3cd,stroke:#9a6700
87+
style C fill:#fde8e8,stroke:#cf222e
88+
style D fill:#e6ffec,stroke:#1a7f37
89+
style E fill:#f6f8fa,stroke:#656d76
90+
```
91+
92+
| Job | Purpose | Permissions |
93+
|-----|---------|-------------|
94+
| **activation** | Validates lockfile integrity (hash check), confirms user membership (write access required), validates auth credentials | Read-only |
95+
| **agent** | Runs the LLM inside an Agentic Workflows Firewall (AWF) sandbox (Docker container) with a network firewall. Reads the repo, calls the Copilot API for inference, executes tools. All outbound traffic is filtered to an allowlist. | Read-only, no write tokens |
96+
| **detection** | Inspects agent output for policy violations. Security gate between the agent and write actions. | Read-only |
97+
| **safe_outputs** | Receives validated agent output. Posts comments, creates PRs, updates issues. Permissions scoped per output type. | Scoped write |
98+
| **conclusion** | Final status reporting and cleanup. | Read-only |
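
The handoff between the agent, detection, and safe_outputs jobs can be pictured with a minimal Python sketch. It is an illustration of capability separation only, not gh-aw's implementation: the output type names, the `ProposedOutput` class, and the toy token check are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical output types; gh-aw's real safe-output schema may differ.
ALLOWED_OUTPUT_TYPES = {"add_comment", "create_pull_request", "update_issue"}


@dataclass(frozen=True)
class ProposedOutput:
    """What the read-only agent hands over: data, never a write call."""
    kind: str
    body: str


def detect(outputs):
    """Detection gate: collect policy violations found in the agent output."""
    violations = []
    for item in outputs:
        if item.kind not in ALLOWED_OUTPUT_TYPES:
            violations.append(f"disallowed output type: {item.kind}")
        if "ghp_" in item.body:  # toy check for a leaked token prefix
            violations.append("possible token in output body")
    return violations


def safe_outputs(outputs, write):
    """Deterministic writer: acts only on output that passed detection."""
    violations = detect(outputs)
    if violations:
        raise PermissionError("; ".join(violations))
    for item in outputs:
        write(item.kind, item.body)
```

The point of the shape is that the LLM side only ever produces `ProposedOutput` values, while the `write` callable (the only component holding write permissions in this sketch) is invoked by deterministic code after the whole batch passes the gate.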
99+
100+
The activation job is the first thing that runs, and before it lets the agent do anything, it needs to verify who is calling and how they authenticated.
101+
102+
### Authentication modes
103+
104+
gh-aw supports two auth paths:
105+
106+
1. **PAT path:** A fine-grained personal access token with Copilot permissions, stored as a `COPILOT_GITHUB_TOKEN` secret. Straightforward to set up, works across environments.
107+
2. **`copilot-requests` path:** Uses the Actions-provided `GITHUB_TOKEN` (`github.token`) with a `copilot-requests: write` workflow permission. Ephemeral, no stored secret, scoped to the workflow run.
108+
109+
The `copilot-requests` path is preferred for production because it eliminates long-lived secrets and ties billing to the org rather than an individual.
110+
111+
Once authenticated, the agent job starts inside a sandboxed container. The next layer of protection is the network firewall that controls what the agent can reach.
112+
113+
### AWF sandbox network firewall
114+
115+
The agent job runs inside a Docker container with a strict outbound allowlist:
116+
117+
- Copilot API (inference endpoint)
118+
- github.com (repo content)
119+
- Package registries (npm, PyPI, Ubuntu mirrors)
120+
- SSL/OCSP endpoints
121+
122+
This allowlist is compiled into the lockfile by gh-aw. Custom allowlist entries are not yet supported (see [Known gaps](#known-gaps-and-roadmap)).
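
As a rough mental model of what an outbound allowlist decision looks like, here is a small Python sketch. The hostnames are illustrative stand-ins for the categories listed above, and treating subdomains of an allowlisted host as allowed is an assumption of this example; the real list and matching rules are whatever gh-aw compiles into the lockfile.

```python
from urllib.parse import urlsplit

# Illustrative hostnames only; the real list is compiled into the lockfile.
ALLOWED_HOSTS = {"github.com", "registry.npmjs.org", "pypi.org"}


def is_allowed(url, allowed=ALLOWED_HOSTS):
    """Return True when the URL's host is an allowlisted host or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == entry or host.endswith("." + entry) for entry in allowed)
```

Matching on the parsed hostname rather than on a substring of the URL matters: a lookalike such as `github.com.evil.example` contains an allowlisted name but is not a subdomain of it, so it is rejected.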
123+
124+
The runtime architecture handles isolation and sandboxing, but it does not answer a different question: how do you control which skills the agent is allowed to use in the first place? That is where APM comes in.
125+
126+
## APM governance layer
127+
128+
APM treats agent skills as packages with the same governance primitives that enterprises require for code dependencies:
129+
130+
### Dependency pinning and content scanning
131+
132+
```yaml
133+
# apm.lock.yaml
134+
packages:
135+
- name: org/security-review-skill
136+
version: 1.2.0
137+
commit: a8f3b2c...
138+
integrity: sha256-...
139+
```
140+
141+
Every skill is pinned to an exact commit SHA, and `apm install` generates a lockfile that makes builds reproducible. There is no drift between what was reviewed and what actually runs. At install time, APM also scans skill content for hidden Unicode threats like homoglyphs, bidirectional override characters, and zero-width joiners that could inject invisible instructions into agent prompts.
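
To make the hidden-Unicode threat concrete, the following Python sketch flags bidirectional control characters and zero-width characters in skill text. It is not APM's scanner: the character set is a hand-picked assumption, the function name is hypothetical, and homoglyph detection (which needs a confusables table) is deliberately left out.

```python
import unicodedata

# Bidirectional controls and zero-width characters that are invisible when rendered.
SUSPICIOUS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # bidi embeddings/overrides
    "\u2066", "\u2067", "\u2068", "\u2069",            # bidi isolates
    "\u200b", "\u200c", "\u200d", "\u2060", "\ufeff",  # zero-width characters
}


def scan_hidden_unicode(text):
    """Return (line, column, character name) for each suspicious code point."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPICIOUS:
                findings.append((line_no, col, unicodedata.name(ch, "UNKNOWN")))
    return findings
```

A scanner along these lines would typically run over every markdown file in a skill package before the lockfile is written, and fail the install on any finding.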
142+
143+
Pinning and scanning protect individual repos, but most regulated enterprises also need a way to control which skills are available across the entire org.
144+
145+
### Org-level policy (`apm-policy.yml`)
146+
147+
Placed in the org's `.github` repository, this policy file controls which packages any repo in the org can consume:
148+
149+
```yaml
150+
# .github/apm-policy.yml
151+
name: "Org agent governance"
152+
version: "1.0.0"
153+
enforcement: block
154+
dependencies:
155+
allow:
156+
- "org/approved-security-skills/*"
157+
- "org/approved-review-skills/*"
158+
deny:
159+
- "*" # deny everything not explicitly allowed
160+
require_pinned_constraint: true
161+
```
162+
163+
Policy inheritance works at three levels: enterprise, org, repo. Inheritance is tighten-only: child policies can narrow allowlists, add deny entries, and escalate enforcement, but they cannot relax constraints set by a parent.
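
The allow/deny evaluation and the tighten-only rule can be sketched in a few lines of Python. Both functions are simplified assumptions rather than APM's actual semantics: `is_permitted` treats an explicit allow match as taking precedence over the catch-all deny, and `tighten` models narrowing as keeping a subset of the parent's patterns (a real implementation would also need to handle more specific child patterns and enforcement levels).

```python
from fnmatch import fnmatchcase


def is_permitted(package, allow, deny):
    """Explicit allow entries win; anything else that matches deny is blocked."""
    if any(fnmatchcase(package, pattern) for pattern in allow):
        return True
    return not any(fnmatchcase(package, pattern) for pattern in deny)


def tighten(parent_allow, child_allow):
    """Tighten-only merge: the child may keep or drop parent entries, never add."""
    return [pattern for pattern in child_allow if pattern in parent_allow]
```

Under this model a repo-level policy that tries to add a new allow entry simply has no effect, which is the property "children cannot relax constraints set by a parent" describes.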
164+
165+
Some enterprises go further and require that all package downloads route through a corporate scanning proxy rather than pulling directly from GitHub.
166+
167+
### Artifactory proxy for air-gapped environments
168+
169+
For enterprises that require all dependencies to route through a corporate scanning pipeline:
170+
171+
```bash
172+
export PROXY_REGISTRY_URL=https://artifactory.example.com/artifactory/github
173+
export PROXY_REGISTRY_TOKEN=<token>
174+
export PROXY_REGISTRY_ONLY=1 # blocks direct GitHub access
175+
```
176+
177+
With `PROXY_REGISTRY_ONLY=1`, APM refuses to download packages directly from GitHub. Everything routes through the proxy. The lockfile records the proxy host for reproducibility.
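
The effect of these variables can be modeled as a source-selection step. The Python sketch below is hypothetical throughout: the function name, the URL layout, and the idea that direct GitHub access remains a fallback when `PROXY_REGISTRY_ONLY` is unset are assumptions for illustration, not documented APM behavior.

```python
def candidate_sources(package_path, env):
    """Return download URLs to try, in order, for a package path like 'org/skill'."""
    proxy = env.get("PROXY_REGISTRY_URL", "").rstrip("/")
    proxy_only = env.get("PROXY_REGISTRY_ONLY") == "1"
    if proxy_only and not proxy:
        raise RuntimeError("PROXY_REGISTRY_ONLY=1 but PROXY_REGISTRY_URL is unset")
    sources = []
    if proxy:
        sources.append(f"{proxy}/{package_path}")
    if not proxy_only:
        # Assumed fallback: direct access stays available unless proxy-only is set.
        sources.append(f"https://github.com/{package_path}")
    return sources
```

The useful property to verify in a real deployment is the first branch: with proxy-only mode on, a missing or misconfigured proxy should fail the install rather than silently fall back to a direct download.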
178+
179+
So far, APM controls which skills a repo can use and how they are fetched. But there is one more attack surface: even an approved skill can be undermined if repo-level configuration can override its instructions at runtime.
180+
181+
## The isolation pattern
182+
183+
When APM imports a skill in **isolated mode**, the agent sees only the skill's packaged instructions. Without isolation, the agent's system prompt is assembled from the skill's instructions plus any repo-level config files (`.github/copilot-instructions.md`, `AGENTS.md`, custom instructions). That means a compromised or misconfigured repo-level config could override security review criteria, inject additional tool calls, or modify the skill's persona definitions without the skill publisher's knowledge.
184+
185+
Isolated mode closes that door. The skill publisher controls what the agent is told to do, and the org policy controls which skills are allowed.
186+
187+
In a gh-aw workflow file:
188+
189+
```yaml
190+
imports:
191+
- uses: shared/apm.md
192+
with:
193+
packages:
194+
- org/security-review-skill@1.2.0
195+
isolated: true
196+
```
197+
198+
The combination of isolation + lockfile + org policy gives platform teams a complete answer to "how do we know what instructions the agent followed?"
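
The difference between the two modes comes down to which instruction sources reach the agent. A minimal Python sketch, assuming a hypothetical `assemble_instructions` helper and a plain mapping of repo config paths to their content (neither is part of gh-aw or APM):

```python
def assemble_instructions(skill_text, repo_configs, isolated):
    """Return the (source, content) pairs the agent would see.

    repo_configs maps a file path such as 'AGENTS.md' to its content.
    """
    sources = [("skill", skill_text)]
    if not isolated:
        # Non-isolated mode: repo-level files are appended and can override the skill.
        sources.extend(sorted(repo_configs.items()))
    return sources
```

With `isolated=True` the result contains only the packaged skill, so the pinned commit in the lockfile fully determines what the agent was told.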
199+
200+
That said, gh-aw and APM are still maturing. There are a few areas where the governance story has gaps that platform teams should be aware of.
201+
202+
## Known gaps and roadmap
203+
204+
Three capabilities are not yet fully available:
205+
206+
1. **Sandbox allowlist customization.** The AWF container's network firewall allowlist is compiled by gh-aw and cannot be extended by the consumer. Enterprises that need agents to reach internal APIs (e.g., internal documentation servers, proprietary scanning tools) currently have no path. This has been surfaced to the gh-aw team.
207+
208+
2. **Centralized runner group enforcement.** Organizations cannot yet mandate that all agentic workflows run on a specific runner group (e.g., hardened, isolated runners in a specific region). Individual workflows can specify `runs-on`, but there is no org-level override. This has been surfaced to the gh-aw team.
209+
210+
3. **Self-hosted runner support (ARC with Docker-in-Docker).** Support for Actions Runner Controller with Docker-in-Docker sidecar configurations is actively stabilizing. The AWF sandbox relies on bind mounts and container networking that behave differently when the Docker daemon runs in a separate filesystem from the runner. Platform teams using ARC/DinD should validate their runner architecture against current gh-aw pre-releases before production rollout.
211+
212+
All three are expected to be addressable as gh-aw matures.
213+
214+
With those caveats in mind, here is a practical checklist for teams ready to adopt this pattern today.
215+
216+
## Checklist for platform teams
217+
218+
### Prerequisites
219+
220+
- [ ] GitHub Enterprise Cloud (EMU recommended for regulated environments)
221+
- [ ] GitHub Actions enabled with appropriate runner infrastructure
222+
- [ ] [APM CLI installed](https://microsoft.github.io/apm/quickstart/) on developer machines and CI
223+
- [ ] [gh-aw extension installed](https://github.com/github/gh-aw) (`gh extension install github/gh-aw`)
224+
225+
### Governance setup
226+
227+
- [ ] `apm-policy.yml` published in the org's `.github` repository with allowlist of approved skill packages
228+
- [ ] Skill packages hosted in internal or org-owned repositories with branch protection
229+
- [ ] Lockfiles (`apm.lock.yaml`) committed to consuming repos and protected by rulesets
230+
- [ ] Artifactory proxy configured if corporate scanning pipeline is required (`PROXY_REGISTRY_ONLY=1`)
231+
- [ ] Content scanning enabled for all skill package installations
232+
233+
### Runtime security
234+
235+
- [ ] gh-aw workflows compiled and lockfiles validated (hash integrity check in activation job)
236+
- [ ] Skills imported in isolated mode to prevent repo-level config contamination
237+
- [ ] Auth mode selected: `copilot-requests` (preferred) or PAT with rotation policy
238+
- [ ] PAT secrets (if used) scoped to minimum required permissions
239+
- [ ] Runner groups designated for agentic workloads (manual `runs-on` until org-level enforcement ships)
240+
241+
### Observability
242+
243+
- [ ] Audit log streaming enabled, capturing agentic workflow events
244+
- [ ] Workflow run retention configured for compliance requirements
245+
- [ ] Detection job output monitored for policy violation patterns
246+
- [ ] Alerting configured for failed activation jobs (lockfile tampering indicator)
247+
248+
## References
249+
250+
- [APM documentation](https://microsoft.github.io/apm/)
251+
- [APM Governance Guide](https://microsoft.github.io/apm/enterprise/governance-guide/) (detailed policy spec, enforcement points, bypass contract)
252+
- [APM + gh-aw integration guide](https://microsoft.github.io/apm/integrations/gh-aw/) (the `shared/apm.md` import pattern)
253+
- [APM review panel reference implementation](https://github.com/microsoft/apm/tree/main/.github/skills/apm-review-panel) (credit: Daniel Meppiel)
254+
- [gh-aw documentation](https://github.github.com/gh-aw/)
255+
256+
## Seeking further assistance
257+
258+
{{% seeking-further-assistance-details %}}
259+
260+
## Related links
261+
262+
{{% related-links-github-docs %}}
263+
264+
### Related articles in this framework
265+
266+
- [Governing agents in GitHub Enterprise]({{< relref "governing-agents.md" >}}) — enterprise Copilot policies, model restrictions, audit logging, and MCP governance
267+
- [GitHub Enterprise Policies & Best Practices]({{< relref "governance-policies-best-practices" >}}) — rulesets, CODEOWNERS, commit signing, audit log streaming
268+
- [Application Security Checklist]({{< relref "/library/application-security/checklist" >}}) — foundational code security practices that agent governance builds on
269+
270+
### External resources
271+
272+
- [APM documentation](https://microsoft.github.io/apm/)
273+
- [APM Governance Guide](https://microsoft.github.io/apm/enterprise/governance-guide/)
274+
- [gh-aw documentation](https://github.github.com/gh-aw/)
