Commit 674c994
feat(resume-builder): Slice 1F — web_search via function-wrapped OpenAI built-in
The user's original ask included "you have all the capabilities to access urls
if provided or browse web yourself". Slice 1A delivered the GitHub-URL path
(fetch_github_readme); Slice 1F delivers the general-web path.
ARCHITECTURE DECISION (the non-obvious part)
First attempt: add OpenAI's built-in {"type": "web_search"} directly to
RESUME_BUILDER_TOOL_SPECS. Probed in isolation, works fine. But running
it in the actual intake loop blew up every call with:
400 - "Web Search cannot be used with JSON mode."
Our intake contract REQUIRES text.format = json_object (structured envelope
with draft_updates / assistant_message / status / etc.). OpenAI rejects the
combination at the API boundary. Adding web_search silently 400'd every
intake turn and the service fell back to the regex step-machine — exactly
the silent-fallback pattern Slice 1D pact-tests were built to catch
(the agentic eval immediately surfaced the regression: 3/10 passing).
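The conflict above can be checked before a request leaves the process. The following is a minimal sketch, assuming the Responses API request is assembled as a plain dict; `violates_json_mode` is a hypothetical helper and is not part of this commit.

```python
def violates_json_mode(request: dict) -> bool:
    """Return True when a request pairs JSON mode with the built-in web_search tool."""
    text_format = request.get("text", {}).get("format", {}).get("type")
    tool_types = {tool.get("type") for tool in request.get("tools", [])}
    return text_format == "json_object" and "web_search" in tool_types


# First attempt: built-in web_search sits next to the JSON envelope contract.
first_attempt = {
    "text": {"format": {"type": "json_object"}},
    "tools": [{"type": "web_search"}],
}

# Function-wrapped shape: the main loop only ever sees a function tool.
function_wrapped = {
    "text": {"format": {"type": "json_object"}},
    "tools": [{"type": "function", "name": "web_search"}],
}

print(violates_json_mode(first_attempt))
print(violates_json_mode(function_wrapped))
```

A guard like this would turn the silent 400-then-fallback into a loud failure at spec-assembly time, which is the same goal the Slice 1D pact-tests serve.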
Considered alternatives:
- Two-call pattern: intake (JSON) decides if search needed → second
call (no JSON) executes. Doubles latency, search results only
visible to the agent on the NEXT turn. Worse UX than not searching.
- External provider (Tavily / Brave / Exa) as a local function tool:
clean architecture but adds a new external dependency + API key +
cost commitment we shouldn't make without operator approval.
- Function-wrap (this slice): expose web_search as a FUNCTION tool to
the agent. When the agent calls it, our dispatcher fires its own
inner responses.create — WITHOUT json_object format, WITH OpenAI's
built-in {"type": "web_search"} — and returns the synthesized text
as the function_call_output. Main loop stays JSON-mode safe; agent
gets a research capability on-demand. Zero new dependencies, no new
API key, same shape as fetch_github_readme.
The function wrap is the cleanest landing.
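The function-wrap option can be sketched roughly as below. This is an illustration under stated assumptions, not the code that landed: it assumes `openai_service` exposes the SDK client as a `client` attribute, that the response carries an `output_text` string, and that the 8KB cap is applied to UTF-8 bytes. The stub classes exist only so the sketch runs without network access.

```python
from types import SimpleNamespace

MAX_RESULT_BYTES = 8 * 1024  # assumption: the 8KB cap is measured in UTF-8 bytes


def web_search(query, *, openai_service):
    """Fire one inner, non-JSON responses.create call and return a result envelope."""
    if not query or not query.strip():
        return {"ok": False, "error": "empty query"}
    if openai_service is None:
        return {"ok": False, "error": "openai_service not available"}
    try:
        response = openai_service.client.responses.create(  # `.client` is assumed
            model="gpt-5.4-mini",            # model named in the cost notes
            input=query,
            tools=[{"type": "web_search"}],  # built-in tool, no text.format here
        )
    except Exception as exc:  # capture instead of raising into the agent loop
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
    raw = (getattr(response, "output_text", "") or "").encode("utf-8")
    # errors="ignore" drops a multi-byte character cut in half by the cap
    return {"ok": True, "result": raw[:MAX_RESULT_BYTES].decode("utf-8", errors="ignore")}


class StubResponses:
    """Stand-in for client.responses that records calls and returns canned text."""

    def __init__(self, text=None, error=None):
        self.text, self.error, self.calls = text, error, []

    def create(self, **kwargs):
        self.calls.append(kwargs)
        if self.error is not None:
            raise self.error
        return SimpleNamespace(output_text=self.text)


def make_service(**kwargs):
    """Build a fake openai_service whose client only supports responses.create."""
    return SimpleNamespace(client=SimpleNamespace(responses=StubResponses(**kwargs)))


print(web_search("senior MLE resume norms", openai_service=make_service(text="grounded answer")))
```

Because the inner call never sets `text.format`, the outer intake request can keep `json_object` while the agent still receives searched text as an ordinary function_call_output.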
WHAT LANDED
backend/services/resume_builder_tools.py
- _web_search(query, *, openai_service) — fires the inner non-JSON call
with built-in web_search tool, extracts synthesized text, caps at
8KB, returns {"ok": True, "result": str} or {"ok": False, "error": str}
- WEB_SEARCH_TOOL_SPEC — function-tool shape with required `query` arg
- _TOOLS_REQUIRING_OPENAI sentinel set so the dispatcher knows which
tools need the OpenAIService forwarded (only web_search; fetch is
HTTP-only)
- execute_tool() gains an optional openai_service kwarg, forwards to
tools that need it
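A rough sketch of how the spec, the sentinel set, and the dispatcher could fit together. The names `WEB_SEARCH_TOOL_SPEC`, `_TOOLS_REQUIRING_OPENAI`, and `execute_tool` come from this commit; the spec body, the handler table, the description text, and the two stand-in handlers are assumptions made so the example runs on its own.

```python
# Function-tool shape (flat Responses API style is assumed here).
WEB_SEARCH_TOOL_SPEC = {
    "type": "function",
    "name": "web_search",
    "description": "Search the web for external context. Use sparingly.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
        "additionalProperties": False,
    },
}

# Only these tools get the OpenAIService forwarded; fetch is HTTP-only.
_TOOLS_REQUIRING_OPENAI = {"web_search"}


def _fetch_github_readme(url):
    """Stand-in handler: note it accepts no openai_service keyword."""
    return {"ok": True, "result": f"README for {url}"}


def _web_search(query, *, openai_service):
    """Stand-in handler: the real one fires the inner non-JSON call."""
    if openai_service is None:
        return {"ok": False, "error": "openai_service not available"}
    return {"ok": True, "result": f"searched: {query}"}


_HANDLERS = {  # hypothetical lookup table
    "fetch_github_readme": _fetch_github_readme,
    "web_search": _web_search,
}


def execute_tool(name, arguments, *, openai_service=None):
    """Dispatch a tool call, forwarding openai_service only where it is needed."""
    handler = _HANDLERS.get(name)
    if handler is None:
        return {"ok": False, "error": f"unknown tool: {name}"}
    kwargs = dict(arguments)
    if name in _TOOLS_REQUIRING_OPENAI:
        kwargs["openai_service"] = openai_service
    return handler(**kwargs)


print(execute_tool("web_search", {"query": "resume norms"}, openai_service=object()))
print(execute_tool("fetch_github_readme", {"url": "https://github.com/example/repo"}, openai_service=object()))
```

Keeping the forwarding decision in one set means a new HTTP-only tool cannot accidentally receive a keyword it does not accept.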
backend/services/resume_builder_service.py
- _run_llm_turn now binds openai_service into the tool_executor
closure so the agent can dispatch web_search through the loop
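The binding step might look like the sketch below, assuming the loop calls the executor as `tool_executor(name, arguments)`. `make_tool_executor` is a hypothetical name, and `execute_tool` here is a stand-in that only reports whether a service arrived.

```python
from functools import partial


def execute_tool(name, arguments, *, openai_service=None):
    """Stand-in dispatcher that reports whether a service was forwarded."""
    return {"tool": name, "has_service": openai_service is not None}


def make_tool_executor(openai_service):
    """Bind the service once per turn so the agent loop never has to pass it."""
    return partial(execute_tool, openai_service=openai_service)


tool_executor = make_tool_executor(openai_service=object())
print(tool_executor("web_search", {"query": "resume norms"}))
```

The loop keeps a two-argument executor signature, so existing tools and tests that call it do not need to change.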
prompts/resume_builder/v1.json + test_prompts.py byte-mirror
- "Tools you can call" block now lists web_search alongside
fetch_github_readme
- Prompt teaches the agent WHEN to use it (external context,
company / role norms, industry questions) and WHEN NOT TO
(anything the user already shared, generic advice, small talk,
speculative queries). Cites the "use sparingly" rule explicitly
because each search is a separate API call (latency + cost).
tests/backend/test_resume_builder_tools.py
- Updated test_tool_spec_includes_web_search to assert the function-
tool shape (NOT the server-side shape)
- 5 new hermetic tests for _web_search via a stubbed OpenAI client
(success path, empty-query reject, no-service reject, dispatch
exception captured and never raised, oversize-result truncation),
plus a guard that fetch_github_readme does NOT receive
openai_service (it would crash if it did)
tests/quality/resume_builder_agentic_runner.py
- 2 new LLM scenarios:
web_search_fires_on_external_context_question (positive — user
asks "what does Anthropic look for on a Senior MLE resume?")
web_search_skipped_for_user_provided_info (negative — user is
sharing their own background, no search should fire)
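One way the two scenarios could assert tool usage is to count function-call items in a recorded turn. This is a sketch only: the item shape (`type` and `name` keys) and both sample turns are assumptions, not the runner's actual fixtures.

```python
def count_tool_calls(output_items, tool_name):
    """Count function_call items for one tool in a turn's output items."""
    return sum(
        1
        for item in output_items
        if item.get("type") == "function_call" and item.get("name") == tool_name
    )


# Hypothetical recorded turn for the positive scenario.
positive_turn = [
    {"type": "function_call", "name": "web_search",
     "arguments": '{"query": "Anthropic Senior MLE resume"}'},
    {"type": "message", "content": "tailored reply"},
]

# Hypothetical recorded turn for the negative scenario.
negative_turn = [{"type": "message", "content": "thanks, noted your background"}]

print(count_tool_calls(positive_turn, "web_search"))
print(count_tool_calls(negative_turn, "web_search"))
```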
VERIFICATION
- 145 hermetic tests across affected suites green.
- 10/10 LLM scenarios pass on the live API (gpt-5.4). Inspection of
the web_search scenario shows the model fires the tool ONCE,
receives a grounded answer ("Anthropic's Senior MLE postings tend
to emphasize strong Python + ML + software engineering, production
ML systems, and measurable impact like scale, latency, reliability,
or cost improvements..."), and synthesizes a tailored reply with
no hallucination.
Cost-of-search: each web_search invocation adds one extra
responses.create call (gpt-5.4-mini, ~600 tokens). Realistic usage
per session: 0-2 invocations (prompt explicitly tells the agent to
use this sparingly). Latency: ~1-2s per search.
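The per-session overhead follows directly from the figures above. A small worked sketch, treating the approximate numbers (~600 tokens, ~1-2s per search) as exact for the arithmetic; `session_overhead` is a hypothetical helper.

```python
TOKENS_PER_SEARCH = 600        # "~600 tokens" per inner call
LATENCY_PER_SEARCH_S = (1, 2)  # "~1-2s per search"


def session_overhead(searches):
    """Estimate extra calls, tokens, and latency for a number of searches."""
    low, high = LATENCY_PER_SEARCH_S
    return {
        "extra_calls": searches,
        "extra_tokens": searches * TOKENS_PER_SEARCH,
        "latency_s": (searches * low, searches * high),
    }


for n in (0, 1, 2):
    print(n, session_overhead(n))
```

At the upper end of realistic usage (2 searches), that is about 1200 extra tokens and 2-4s of added latency per session.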
WHAT'S STILL PARKED (Phase 2 remainder)
- Full eval expansion to 15-20 fixtures with rubric scoring
- ADR-031 documenting the agentic shape
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent a2699d8 commit 674c994
6 files changed
Lines changed: 462 additions & 4 deletions
File tree
- backend/services
- prompts/resume_builder
- tests
  - backend
  - quality