Commit a1f64b1

docs(blog): note Opus 4.8 cost basis and tighten payoff section

- Add a note that the measured token costs use Claude Opus 4.8 pricing
- Remove the split-model (strong vs cheap) guidance; triage/PoC also ran on Opus
- Add a closing line summarizing the per-commit token economics
- Clarify dynamic triage spins up local instances of the app under test
- Reword the deterministic-rescan sentence for clarity

1 parent deab819 commit a1f64b1

1 file changed

Lines changed: 5 additions & 8 deletions

src/content/blog/appsec-agent.mdx

@@ -147,19 +147,16 @@ Here is what that looks like in practice, on [Komga](https://github.com/gotson/k
 | Cost | $112 in tokens | $0 — CPU only |
 | Time | 2 hours 51 minutes | 1 minute 37 seconds |
 
+*Token costs are measured with Claude Opus 4.8 ($5 / $25 per million input / output tokens).*
+
 Across the projects we have tested so far, the one-time spend lands in the $100–$200 range. After that, every commit is just a scan — minutes of CPU, not hundreds of dollars.
 
 Only the first run is cold. Those three hours went into the gap between the engine's built-in knowledge and one specific stack — the unmodeled libraries, the project's own patterns. Once distilled, that knowledge does not expire. After that, a commit means one of two things:
 
 - **Nothing new crossed a boundary** — the usual case. The engine re-scans with the artifacts already in the repo, and a new finding in an old pattern costs nothing extra.
 - **The code crossed a boundary the artifacts do not cover** — a new library, a new kind of entry point. The agent comes back and models the delta, not the codebase.
 
-Scan and triage do not need the same model:
-
-- **Authoring rules and approximations** wants the strong model. Each artifact is a judgment the engine will apply everywhere, so quality there is amplified at scale — and so are mistakes.
-- **Triage and PoC writing** work with a much cheaper model without losing accuracy in our testing. The engine hands over the complete trace for every finding — from where the data entered to the dangerous call — so the model is judging one well-framed flow at a time, not hunting through a codebase.
-
-Put the strong model where its judgment gets distilled. The analyzer's reports carry the rest.
+Either way, tokens are spent only on what the artifacts do not yet cover. Everything else is a scan.
 
 ## Get started
 
@@ -172,7 +169,7 @@ npx skills add https://github.com/seqra/opentaint
 The skill offers to install the engine itself if it is missing. To run the workflow, open your coding agent in the project and ask it to find vulnerabilities. The skill asks two questions before touching anything, then works the pipeline on its own:
 
 - **Scan depth** — the Lite / Normal / Deep ladder from above.
-- **Exploit confirmation** — whether to confirm findings with PoCs. Dynamic triage launches throwaway local instances and tears them down at the end.
+- **Exploit confirmation** — whether to confirm findings with PoCs. Dynamic triage launches throwaway local instances of the application under test and tears them down at the end.
 
 Everything it produces lands in one `.opentaint/` directory at the project root:
 
@@ -182,6 +179,6 @@ Everything it produces lands in one `.opentaint/` directory at the project root:
 - a PoC script per confirmed vulnerability
 - a `vulnerabilities.md` report on top
 
-Keep that directory in the repo. On every commit after, the engine replays it deterministically, no agent in the loop: that is the cheap scan in the title.
+Keep that directory in the repo. On every commit after, you can run the engine scan with those artifacts deterministically — the agent is no longer needed.
 
 The more of your code that AI writes, the more you need a formal layer underneath it — one that turns every discovery into coverage that lasts.
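
The token economics described in the changed text can be sketched as simple arithmetic. This is a minimal illustration, not part of the commit: the per-million-token prices and the $112 one-time spend come from the diff above, while the function names, the token counts passed in, and the assumption that a routine commit spends no tokens are hypothetical.

```python
# Prices quoted in the added note: USD per million input / output tokens.
INPUT_USD_PER_MTOK = 5.0
OUTPUT_USD_PER_MTOK = 25.0


def token_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one agent run, given its token counts (hypothetical helper)."""
    return (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


def amortized_cost_per_commit(
    one_time_usd: float, delta_usd_per_commit: float, commits: int
) -> float:
    """Spread the cold-run spend over later commits.

    delta_usd_per_commit is the average token spend when a commit crosses a
    boundary the artifacts do not cover; it is 0.0 for scan-only commits.
    """
    if commits <= 0:
        raise ValueError("commits must be positive")
    return one_time_usd / commits + delta_usd_per_commit


if __name__ == "__main__":
    # Illustrative token counts only; the source does not report them.
    print(token_cost_usd(1_000_000, 1_000_000))
    # $112 cold run from the table, amortized over 100 scan-only commits.
    print(amortized_cost_per_commit(112.0, 0.0, 100))
```

Under these assumptions the cold-run cost per commit shrinks as the commit count grows, which is the point of the added closing line: tokens are spent only on what the artifacts do not yet cover.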
