Skip to content

Commit f55ca47

Browse files
Copilotdavidslater
andauthored
Resolve merge conflicts with main (#34)
* Initial plan * Add structured reflect triage support Agent-Logs-Url: https://github.com/github/gh-aw-threat-detection/sessions/affa5959-e553-45b8-96e7-f804589f3059 Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com> * Document two-phase reflect detection Agent-Logs-Url: https://github.com/github/gh-aw-threat-detection/sessions/affa5959-e553-45b8-96e7-f804589f3059 Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com> * Default triage to local reflect endpoint Agent-Logs-Url: https://github.com/github/gh-aw-threat-detection/sessions/affa5959-e553-45b8-96e7-f804589f3059 Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com> * Address validation feedback for structured retries Agent-Logs-Url: https://github.com/github/gh-aw-threat-detection/sessions/affa5959-e553-45b8-96e7-f804589f3059 Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com> * Polish validation feedback handling Agent-Logs-Url: https://github.com/github/gh-aw-threat-detection/sessions/affa5959-e553-45b8-96e7-f804589f3059 Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com> * Unify correction prompt helpers Agent-Logs-Url: https://github.com/github/gh-aw-threat-detection/sessions/affa5959-e553-45b8-96e7-f804589f3059 Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com> * Make reflect retry settings clearer Agent-Logs-Url: https://github.com/github/gh-aw-threat-detection/sessions/affa5959-e553-45b8-96e7-f804589f3059 Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com> * Resolve reflect review comments Agent-Logs-Url: https://github.com/github/gh-aw-threat-detection/sessions/d9bcbb5d-ff2a-436a-9cee-3328587c44e1 Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com> * Refine triage truncation and retry wording Agent-Logs-Url: https://github.com/github/gh-aw-threat-detection/sessions/d9bcbb5d-ff2a-436a-9cee-3328587c44e1 Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com> * Address final validation suggestions Agent-Logs-Url: https://github.com/github/gh-aw-threat-detection/sessions/d9bcbb5d-ff2a-436a-9cee-3328587c44e1 Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com> * Clarify final review safeguards Agent-Logs-Url: https://github.com/github/gh-aw-threat-detection/sessions/d9bcbb5d-ff2a-436a-9cee-3328587c44e1 Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com> * Document response parsing safeguards Agent-Logs-Url: https://github.com/github/gh-aw-threat-detection/sessions/d9bcbb5d-ff2a-436a-9cee-3328587c44e1 Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com> * Resolve remaining reflect review comments Agent-Logs-Url: https://github.com/github/gh-aw-threat-detection/sessions/14c8c39e-57ca-4f32-8ff3-20c067920367 Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com> * Apply validation readability follow-ups Agent-Logs-Url: https://github.com/github/gh-aw-threat-detection/sessions/14c8c39e-57ca-4f32-8ff3-20c067920367 Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com> * Apply final validation polish Agent-Logs-Url: https://github.com/github/gh-aw-threat-detection/sessions/14c8c39e-57ca-4f32-8ff3-20c067920367 Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com> * Add reflect fallback tests --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com> Co-authored-by: David Slater <davidslater@users.noreply.github.com>
1 parent 1cd3823 commit f55ca47

15 files changed

Lines changed: 1402 additions & 32 deletions

README.md

Lines changed: 16 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -59,8 +59,22 @@ threat-detect [flags] <artifacts-dir>
5959
- `--model` — Model override for the engine
6060
- `--prompt-template` — Path to custom prompt template
6161
- `--output` — Path to write JSON result (defaults to stdout)
62+
- `--triage` — Enable fast Phase 1 structured-output triage. Default: `true`
63+
- `--reflect-url``api-proxy` `/reflect` base URL for structured-output calls. Default: `http://127.0.0.1:8080/reflect`
64+
- `--triage-model` — Model override for Phase 1 `/reflect` triage
65+
- `--triage-max-bytes` — Maximum bytes per artifact to inline during triage
66+
- `--triage-retries` — Retries for malformed structured-output responses
6267
- `--version` — Print version and exit
6368

69+
`--reflect-url` can also be supplied with `THREAT_DETECTION_REFLECT_URL`,
70+
`API_PROXY_REFLECT_URL`, or `REFLECT_URL`. By default, `threat-detect` first
71+
tries a non-agentic `/reflect` call with a strict JSON schema matching the result
72+
contract. An all-false valid triage result exits successfully without the full
73+
detector. Threats, uncertainty, unsupported models, proxy errors, or malformed
74+
responses fail safe into the full detector. The full detector preserves the
75+
existing CLI engine behavior and prefers `/reflect` structured output when a
76+
schema-capable model is available.
77+
6478
**Exit codes:**
6579
- `0` — Safe (no threats detected)
6680
- `1` — Threat detected
@@ -131,13 +145,13 @@ The extraction staging model is:
131145
- Stage 3: `github/gh-aw` integration
132146

133147
Stage 1 is functionally represented in this repository.
134-
The standalone Go CLI, artifact reader, prompt builder, result parser, engine abstraction, W3C-style specification, unit tests, CI, Dockerfile, and release workflow are present.
148+
The standalone Go CLI, artifact reader, prompt builder, two-phase `/reflect` triage, result parser, engine abstraction, W3C-style specification, unit tests, CI, Dockerfile, and release workflow are present.
135149
Remaining work involves integration with `github/gh-aw` and production hardening of the container runtime in Stage 2/3, not additional JavaScript porting in this repository.
136150

137151
Decisions for the unresolved extraction questions:
138152

139153
- **JavaScript scripts**: detection setup and result parsing are implemented in Go here; the old GitHub Actions JavaScript scripts should not be needed once `gh-aw` switches to the container contract.
140-
- **Engine CLIs**: do not bundle Copilot, Claude, or Codex CLIs into the detector image. The detector invokes the selected engine CLI from `PATH` and forwards the `--model` value. Production `gh-aw` integration should install or provide the selected engine CLI in the detection job, then run the pinned detector binary extracted from the detector image in that same runner/AWF environment. This keeps the image small, avoids runtime installation inside the image, and reuses the existing engine installation/authentication path.
154+
- **Engine CLIs and `/reflect`**: do not bundle Copilot, Claude, or Codex CLIs into the detector image. The detector invokes the selected engine CLI from `PATH` and forwards the `--model` value when full CLI analysis is needed. When `--reflect-url` is configured, the detector can call `api-proxy` directly for structured-output triage and schema-capable full analysis before falling back to CLI behavior. Production `gh-aw` integration should install or provide the selected engine CLI in the detection job, then run the pinned detector binary extracted from the detector image in that same runner/AWF environment. This keeps the image small, avoids runtime installation inside the image, and reuses the existing engine installation/authentication path.
141155
- **Custom steps**: custom `threat-detection.steps` remain orchestrator-owned. They should run before or after the container in the `gh-aw` job rather than being passed into this container as arbitrary scripts.
142156
- **Backward compatibility**: do not ship a long-lived dual-mode compatibility window. Stage 3 should switch `gh-aw` to the pinned detector image path after Stage 4 validation passes; users that need inline detection can pin an older `gh-aw` release. A temporary internal fallback is acceptable during implementation only, but should not become a documented public feature flag unless Stage 4 exposes a blocking compatibility issue.
143157
- **Ollama/LlamaGuard**: keep this as a custom-step pattern unless a dedicated image variant is explicitly required.

cmd/threat-detect/main.go

Lines changed: 124 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -21,16 +21,20 @@ import (
2121
"fmt"
2222
"os"
2323
"os/signal"
24+
"strconv"
2425

2526
"github.com/github/gh-aw-threat-detection/pkg/artifacts"
2627
"github.com/github/gh-aw-threat-detection/pkg/detector"
2728
"github.com/github/gh-aw-threat-detection/pkg/engine"
2829
)
2930

3031
const (
31-
exitSafe = 0
32-
exitThreat = 1
33-
exitError = 2
32+
exitSafe = 0
33+
exitThreat = 1
34+
exitError = 2
35+
36+
fullDetectionCorrectionSummaryFormat = "Your previous response did not contain a valid %s JSON object"
37+
fullDetectionCorrectionInstructionFormat = "Return exactly one corrected result line using the required %s prefix."
3438
)
3539

3640
func main() {
@@ -42,18 +46,28 @@ func run() int {
4246
defer stop()
4347

4448
var (
45-
engineID string
46-
model string
47-
promptFile string
48-
outputJSON string
49-
version bool
49+
engineID string
50+
model string
51+
promptFile string
52+
outputJSON string
53+
version bool
54+
triage bool
55+
reflectURL string
56+
triageModel string
57+
triageMaxBytes int
58+
triageRetries int
5059
)
5160

5261
flag.StringVar(&engineID, "engine", "", "AI engine to use (copilot, claude, codex)")
5362
flag.StringVar(&model, "model", "", "Model to use for detection")
5463
flag.StringVar(&promptFile, "prompt-template", "", "Path to custom prompt template (defaults to built-in)")
5564
flag.StringVar(&outputJSON, "output", "", "Path to write JSON result (defaults to stdout)")
5665
flag.BoolVar(&version, "version", false, "Print version and exit")
66+
flag.BoolVar(&triage, "triage", envBool("THREAT_DETECTION_TRIAGE", true), "Run Phase 1 structured-output triage before full detection (env: THREAT_DETECTION_TRIAGE)")
67+
flag.StringVar(&reflectURL, "reflect-url", envFirstOrDefault(engine.DefaultReflectURL, "THREAT_DETECTION_REFLECT_URL", "API_PROXY_REFLECT_URL", "REFLECT_URL"), "api-proxy reflect base URL")
68+
flag.StringVar(&triageModel, "triage-model", os.Getenv("THREAT_DETECTION_TRIAGE_MODEL"), "Model to use for reflect triage")
69+
flag.IntVar(&triageMaxBytes, "triage-max-bytes", envInt("THREAT_DETECTION_TRIAGE_MAX_BYTES", detector.DefaultTriageMaxBytes()), "Maximum bytes per artifact to inline for triage")
70+
flag.IntVar(&triageRetries, "triage-retries", envInt("THREAT_DETECTION_TRIAGE_RETRIES", 1), "Retries for malformed structured outputs")
5771
flag.Parse()
5872

5973
if version {
@@ -77,6 +91,27 @@ func run() int {
7791
return exitError
7892
}
7993

94+
if triage && reflectURL != "" {
95+
triagePrompt, err := detector.BuildTriagePrompt(arts, triageMaxBytes)
96+
if err == nil {
97+
triageResult, err := (&engine.ReflectClient{
98+
BaseURL: reflectURL,
99+
Model: triageModel,
100+
Retries: triageRetries,
101+
}).AnalyzeStructured(ctx, triagePrompt)
102+
if err == nil && triageResult.IsSafe() {
103+
return writeResult(triageResult, outputJSON)
104+
}
105+
if err != nil {
106+
fmt.Fprintf(os.Stderr, "Triage inconclusive, running full detection: %v\n", err)
107+
} else {
108+
fmt.Fprintln(os.Stderr, "Triage found possible threats, running full detection")
109+
}
110+
} else {
111+
fmt.Fprintf(os.Stderr, "Error building triage prompt, running full detection: %v\n", err)
112+
}
113+
}
114+
80115
// Build the prompt
81116
promptTemplate := ""
82117
if promptFile != "" {
@@ -94,28 +129,59 @@ func run() int {
94129
return exitError
95130
}
96131

132+
if reflectURL != "" {
133+
reflectResult, err := (&engine.ReflectClient{
134+
BaseURL: reflectURL,
135+
Model: firstNonEmpty(model, triageModel),
136+
Retries: triageRetries,
137+
}).AnalyzeStructured(ctx, prompt)
138+
if err == nil {
139+
return writeResult(reflectResult, outputJSON)
140+
}
141+
fmt.Fprintf(os.Stderr, "Structured reflect detection unavailable, using CLI engine: %v\n", err)
142+
}
143+
97144
// Create engine
98145
eng, err := engine.New(engineID, model)
99146
if err != nil {
100147
fmt.Fprintf(os.Stderr, "Error creating engine: %v\n", err)
101148
return exitError
102149
}
103150

104-
// Run detection
105-
rawOutput, err := eng.Analyze(ctx, prompt)
151+
result, err := analyzeWithRetries(ctx, eng, prompt, triageRetries)
106152
if err != nil {
107153
fmt.Fprintf(os.Stderr, "Error running detection: %v\n", err)
108154
return exitError
109155
}
110156

111-
// Parse result
112-
result, err := detector.ParseResult(rawOutput)
113-
if err != nil {
114-
fmt.Fprintf(os.Stderr, "Error parsing result: %v\n", err)
115-
return exitError
157+
return writeResult(result, outputJSON)
158+
}
159+
160+
func analyzeWithRetries(ctx context.Context, eng engine.Engine, prompt string, retries int) (*detector.Result, error) {
161+
attempts := retries + 1
162+
if attempts < 1 {
163+
attempts = 1
164+
}
165+
currentPrompt := prompt
166+
var lastErr error
167+
for i := 0; i < attempts; i++ {
168+
rawOutput, err := eng.Analyze(ctx, currentPrompt)
169+
if err != nil {
170+
return nil, err
171+
}
172+
result, err := detector.ParseResult(rawOutput)
173+
if err == nil {
174+
return result, nil
175+
}
176+
lastErr = err
177+
summary := fmt.Sprintf(fullDetectionCorrectionSummaryFormat, detector.ResultPrefix)
178+
instruction := fmt.Sprintf(fullDetectionCorrectionInstructionFormat, detector.ResultPrefix)
179+
currentPrompt = detector.BuildCorrectionPrompt(prompt, summary, err.Error(), instruction)
116180
}
181+
return nil, lastErr
182+
}
117183

118-
// Output result
184+
func writeResult(result *detector.Result, outputJSON string) int {
119185
jsonBytes, err := json.MarshalIndent(result, "", " ")
120186
if err != nil {
121187
fmt.Fprintf(os.Stderr, "Error marshaling result: %v\n", err)
@@ -137,3 +203,45 @@ func run() int {
137203
}
138204
return exitSafe
139205
}
206+
207+
func envFirstOrDefault(fallback string, keys ...string) string {
208+
for _, key := range keys {
209+
if value := os.Getenv(key); value != "" {
210+
return value
211+
}
212+
}
213+
return fallback
214+
}
215+
216+
func envBool(key string, fallback bool) bool {
217+
value := os.Getenv(key)
218+
if value == "" {
219+
return fallback
220+
}
221+
parsed, err := strconv.ParseBool(value)
222+
if err != nil {
223+
return fallback
224+
}
225+
return parsed
226+
}
227+
228+
func envInt(key string, fallback int) int {
229+
value := os.Getenv(key)
230+
if value == "" {
231+
return fallback
232+
}
233+
parsed, err := strconv.Atoi(value)
234+
if err != nil {
235+
return fallback
236+
}
237+
return parsed
238+
}
239+
240+
func firstNonEmpty(values ...string) string {
241+
for _, value := range values {
242+
if value != "" {
243+
return value
244+
}
245+
}
246+
return ""
247+
}

cmd/threat-detect/main_test.go

Lines changed: 152 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,152 @@
1+
package main
2+
3+
import (
4+
"encoding/json"
5+
"flag"
6+
"net/http"
7+
"net/http/httptest"
8+
"os"
9+
"path/filepath"
10+
"strings"
11+
"sync/atomic"
12+
"testing"
13+
)
14+
15+
func TestRunReflectUnavailableFallsBackToAgenticEngine(t *testing.T) {
16+
var reflectRequests atomic.Int32
17+
reflectServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
18+
reflectRequests.Add(1)
19+
http.Error(w, "reflect not implemented", http.StatusNotImplemented)
20+
}))
21+
defer reflectServer.Close()
22+
23+
artifactsDir := t.TempDir()
24+
outputPath := filepath.Join(t.TempDir(), "result.json")
25+
copilotMarker := filepath.Join(t.TempDir(), "copilot-called")
26+
fakeBinDir := writeFakeCopilot(t, copilotMarker, `THREAT_DETECTION_RESULT:{"prompt_injection":true,"secret_leak":false,"malicious_patch":false,"reasons":["agentic fallback"]}`)
27+
28+
code := runWithTestArgs(t, []string{
29+
"threat-detect",
30+
"-reflect-url", reflectServer.URL,
31+
"-output", outputPath,
32+
artifactsDir,
33+
}, map[string]string{
34+
"PATH": fakeBinDir + string(os.PathListSeparator) + os.Getenv("PATH"),
35+
})
36+
37+
if code != exitThreat {
38+
t.Fatalf("run() exit code = %d, want %d", code, exitThreat)
39+
}
40+
if reflectRequests.Load() == 0 {
41+
t.Fatal("expected /reflect to be attempted before fallback")
42+
}
43+
if _, err := os.Stat(copilotMarker); err != nil {
44+
t.Fatalf("expected copilot fallback to run: %v", err)
45+
}
46+
result := readResultFile(t, outputPath)
47+
if !result["prompt_injection"].(bool) {
48+
t.Fatalf("fallback result prompt_injection = false, want true: %#v", result)
49+
}
50+
}
51+
52+
func TestRunReflectSuccessDoesNotInvokeAgenticEngine(t *testing.T) {
53+
var postRequests atomic.Int32
54+
reflectServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
55+
switch r.Method {
56+
case http.MethodGet:
57+
w.Header().Set("Content-Type", "application/json")
58+
_, _ = w.Write([]byte(`{"models":[{"id":"schema","provider":"openai","capabilities":{"json_schema":true}}]}`))
59+
case http.MethodPost:
60+
postRequests.Add(1)
61+
w.Header().Set("Content-Type", "application/json")
62+
_, _ = w.Write([]byte(`{"prompt_injection":false,"secret_leak":false,"malicious_patch":false,"reasons":[]}`))
63+
default:
64+
http.Error(w, "unexpected method", http.StatusMethodNotAllowed)
65+
}
66+
}))
67+
defer reflectServer.Close()
68+
69+
artifactsDir := t.TempDir()
70+
outputPath := filepath.Join(t.TempDir(), "result.json")
71+
copilotMarker := filepath.Join(t.TempDir(), "copilot-called")
72+
fakeBinDir := writeFakeCopilot(t, copilotMarker, "copilot should not run")
73+
74+
code := runWithTestArgs(t, []string{
75+
"threat-detect",
76+
"-triage=false",
77+
"-reflect-url", reflectServer.URL,
78+
"-output", outputPath,
79+
artifactsDir,
80+
}, map[string]string{
81+
"PATH": fakeBinDir + string(os.PathListSeparator) + os.Getenv("PATH"),
82+
})
83+
84+
if code != exitSafe {
85+
t.Fatalf("run() exit code = %d, want %d", code, exitSafe)
86+
}
87+
if postRequests.Load() == 0 {
88+
t.Fatal("expected successful structured /reflect detection")
89+
}
90+
if _, err := os.Stat(copilotMarker); !os.IsNotExist(err) {
91+
t.Fatalf("copilot should not run when /reflect succeeds, stat err = %v", err)
92+
}
93+
result := readResultFile(t, outputPath)
94+
if result["prompt_injection"].(bool) || result["secret_leak"].(bool) || result["malicious_patch"].(bool) {
95+
t.Fatalf("reflect result is not safe: %#v", result)
96+
}
97+
}
98+
99+
func runWithTestArgs(t *testing.T, args []string, env map[string]string) int {
100+
t.Helper()
101+
102+
originalArgs := os.Args
103+
originalFlags := flag.CommandLine
104+
t.Cleanup(func() {
105+
os.Args = originalArgs
106+
flag.CommandLine = originalFlags
107+
})
108+
os.Args = args
109+
flag.CommandLine = flag.NewFlagSet(args[0], flag.ContinueOnError)
110+
111+
for key, value := range env {
112+
t.Setenv(key, value)
113+
}
114+
115+
return run()
116+
}
117+
118+
func writeFakeCopilot(t *testing.T, markerPath, output string) string {
119+
t.Helper()
120+
121+
binDir := t.TempDir()
122+
scriptPath := filepath.Join(binDir, "copilot")
123+
script := strings.Join([]string{
124+
"#!/bin/sh",
125+
"cat >/dev/null",
126+
"printf called > " + shellQuote(markerPath),
127+
"printf '%s\\n' " + shellQuote(output),
128+
"",
129+
}, "\n")
130+
if err := os.WriteFile(scriptPath, []byte(script), 0o700); err != nil {
131+
t.Fatalf("writing fake copilot: %v", err)
132+
}
133+
return binDir
134+
}
135+
136+
func readResultFile(t *testing.T, path string) map[string]any {
137+
t.Helper()
138+
139+
data, err := os.ReadFile(path)
140+
if err != nil {
141+
t.Fatalf("reading result file: %v", err)
142+
}
143+
var result map[string]any
144+
if err := json.Unmarshal(data, &result); err != nil {
145+
t.Fatalf("parsing result JSON: %v", err)
146+
}
147+
return result
148+
}
149+
150+
func shellQuote(value string) string {
151+
return "'" + strings.ReplaceAll(value, "'", "'\\''") + "'"
152+
}

0 commit comments

Comments
 (0)