
Improve default Document Description Updater prompt#2076

Merged
JSv4 merged 2 commits into main from improve/document-description-prompt
Jun 29, 2026

JSv4 (Collaborator) commented Jun 27, 2026

What

Rewrites the seeded Document Description Updater action-template prompt so per-document one-liners lead with the document's subject matter instead of restating its type or title.

Why

The previous prompt asked for "what this document is about, its type (contract, memo, report, etc.), and the key parties," which produced descriptions like "This document is the Fourth Amendment to the Lease Agreement titled …" — the lead-in and title restatement carry little information.

Change

New prompt directs the agent to:

  • Write one concise sentence leading with the subject matter — the goods, services, rights, or obligations the document concerns — and the key parties.
  • Not begin with "This document is" / "This is".
  • Not restate the title; mention the instrument type only as brief context.

Example shift for the same document:

  • Before: "This document is a contract renewal agreement titled 'FIRST RENEWAL FOR CITY SECRETARY CONTRACT NO. 23685' …"
  • After: "Renewal of an agreement with Fort Worth Teen Court, Inc. for youth justice and diversion services for the City of Fort Worth."

Files

  • opencontractserver/corpuses/template_seeds.py — prompt extracted to DOCUMENT_DESCRIPTION_INSTRUCTIONS and rewritten.
  • opencontractserver/agents/migrations/0017_update_document_description_prompt.py — data migration that re-syncs already-seeded databases (the seeder skips templates that exist by name, mirroring agents/0016).
  • changelog.d/document-description-prompt.changed.md.

Rewrite the seeded Document Description Updater action-template prompt to
lead with the document's subject matter (the goods, services, rights, or
obligations it concerns) and key parties, drop the "This document is …"
lead-in, and stop restating the title.

- Extract the prompt to DOCUMENT_DESCRIPTION_INSTRUCTIONS in template_seeds.py.
- Add data migration agents/0017 to re-sync already-seeded databases (the
  seeder skips templates that already exist by name, so the literal edit does
  not reach them on its own).

claude Bot commented Jun 27, 2026


Review: Improve default Document Description Updater prompt

This PR rewrites the seeded prompt for the Document Description Updater action template — shifting from a type-first, multi-sentence description to a subject-matter-first single sentence — and pairs it with a data migration to propagate the change to already-seeded databases. The approach (extracted constant + idempotent migration) is well-structured and mirrors the pattern established by migration 0016. Two findings are worth addressing before merge.


1. Stale description field contradicts the new prompt — opencontractserver/corpuses/template_seeds.py:99

CONFIRMED. The UI-visible description for the template still reads:

"description": (
    "Reads a newly added document and writes a concise description "
    "summarising its type, purpose, and key parties."
),

The new DOCUMENT_DESCRIPTION_INSTRUCTIONS explicitly instructs the agent to not lead with type, to not focus on purpose-as-type, and to mention the instrument type only as brief context. The advertised behavior ("its type, purpose, and key parties") is precisely what the new prompt was written to eliminate.

This affects both fresh installs (the seeder writes this description to the DB) and existing deployments (the data migration only updates task_instructions, not description). A maintainer looking at the template in the UI will see a description that no longer matches what the agent actually does.

Fix: Update the description string to match the new behavior — something like "Reads a newly added document and writes a one-sentence summary leading with its subject matter and key parties." — and extend the data migration to also patch description on existing rows (mirroring the task_instructions update pattern already in the migration).


2. Migration imports a live-code constant — opencontractserver/agents/migrations/0017_update_document_description_prompt.py:25

PLAUSIBLE. The migration function imports DOCUMENT_DESCRIPTION_INSTRUCTIONS directly from opencontractserver.corpuses.template_seeds at runtime:

def update_document_description_prompt(apps, schema_editor):  # pragma: no cover
    from opencontractserver.corpuses.template_seeds import (
        DOCUMENT_DESCRIPTION_INSTRUCTIONS,
    )

Django migrations are meant to be frozen snapshots. If DOCUMENT_DESCRIPTION_INSTRUCTIONS is ever renamed or moved to another module, or the file is restructured, every manage.py migrate run that still has to apply this migration (a fresh install, a CI test database) will fail with ImportError. Databases that have already applied it are not affected, because the import sits inside the function body and only executes when the migration itself runs. This is still a real deployment hazard.

Note: migration 0016 has the same issue with caml_authoring.py, so this is an existing pattern rather than a newly introduced one. But migration 0012 (update_corpus_agent_prompt) shows the safer approach: it inlines the old and new strings directly as module-level constants in the migration file, which makes the migration immune to future refactoring of application code. Since DOCUMENT_DESCRIPTION_INSTRUCTIONS must also live in template_seeds.py for the seeder, inlining creates two copies that are identical at first and can drift apart later. That drift only matters when someone later edits template_seeds.py, and such an edit needs its own data migration to reach existing databases anyway.

Fix (optional given existing precedent, but worth considering): Inline the prompt string as a local constant at the top of the migration file, rather than importing it. The constant in template_seeds.py remains for the seeder; the migration just doesn't reference it at runtime.


Minor note: reverse is a no-op

migrations.RunPython.noop is used as the reverse operation, so rolling back migration 0017 leaves the new prompt in the database. This is consistent with how migration 0016 handles it, so it is a deliberate and accepted trade-off. If rollback fidelity matters, the old string would need to be inlined in the migration (which overlaps with finding 2).


codecov Bot commented Jun 27, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.


…trings

- Update "Document Description Updater" template's description in
  template_seeds.py to match the new subject-first prompt behaviour:
  "Reads a newly added document and writes a one-sentence summary
  leading with its subject matter and key parties."
- Extend migration 0017 to also patch the description field on already-seeded
  CorpusActionTemplate and AgentConfiguration rows (same idempotent pattern as
  task_instructions).
- Inline both prompt strings as module-level constants in the migration
  (_TASK_INSTRUCTIONS, _UI_DESCRIPTION) rather than importing from
  template_seeds, so the migration is a self-contained frozen snapshot.

claude Bot commented Jun 29, 2026


Code Review

This is a narrow, well-motivated change: the old prompt produced descriptions that led with type/title restatement; the new one leads with subject matter. The structural pattern (extract a named constant, update the TEMPLATES dict, add an idempotent data migration) is correct and consistent with the prior art in this codebase. Three issues survived verification; one minor observation follows.


1. Misleading comment will cause the next prompt edit to silently skip production databases

File: opencontractserver/corpuses/template_seeds.py, lines 78–81

# Per-task prompt for the "Document Description Updater" template. Extracted to a
# named constant so the agents/0017 data migration can re-sync already-seeded
# databases (the seeder skips templates that already exist by name, so editing
# the literal below never reaches them on its own).

The comment implies that migration 0017 imports and uses DOCUMENT_DESCRIPTION_INSTRUCTIONS. It does not — migration 0017 intentionally inlines its own frozen copy (and explains why in its own header). As written, a future developer who edits DOCUMENT_DESCRIPTION_INSTRUCTIONS will read this comment, conclude that 0017 will propagate the change automatically, and ship without a new migration. Existing production databases would silently keep the old prompt.

Suggested replacement:

# Per-task prompt for the "Document Description Updater" template.
# This constant is the authoritative live value used by the seeder for fresh
# installs. Already-seeded databases receive prompt updates via data migrations
# (see agents/0017). If you change this constant, write a new data migration
# so existing databases are also updated.

2. No test for the migration forward function — established pattern is being skipped

File: opencontractserver/agents/migrations/0017_update_document_description_prompt.py, line 41 (# pragma: no cover)

opencontractserver/tests/test_location_tagger_agent.py (lines 42–55) imports migration 0015 via importlib and calls its forward function directly against the live app registry. This established pattern pins:

  • the exact template/agent name strings
  • which fields get updated
  • idempotency (calling it twice is a no-op)

Migration 0017 has non-trivial multi-model conditional logic (two models, two fields, four != guards, early return if template is absent) and is marked # pragma: no cover, so none of it is exercised. A typo in "Document Description Updater" or "Document Description Updater Agent" would pass CI and only surface at migrate time on a live database.

A test mirroring the 0015 pattern would take ~15 lines and would catch name drift, field coverage gaps, and idempotency regressions.
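
The two building blocks such a test would lean on might look like the sketch below. The helper names are hypothetical and the code is framework-free on purpose: in the real suite the apps argument would presumably be Django's live app registry and read_state would query the two rows, neither of which is shown here. Passing None as schema_editor assumes the forward function ignores it, as data-only migrations typically do.

```python
import importlib


def load_forward(dotted_module, func_name):
    """Import a module by dotted string and return one of its functions.

    Migration module names start with a digit, so they cannot appear in an
    ordinary `import` statement; importlib.import_module takes a string.
    """
    return getattr(importlib.import_module(dotted_module), func_name)


def run_twice_and_compare(forward, apps, read_state):
    """Run `forward` twice and return (state_after_first, state_after_second).

    `read_state` is a caller-supplied callable that snapshots whatever the
    migration is expected to change.
    """
    forward(apps, None)  # schema_editor passed as None (assumption: unused)
    first = read_state()
    forward(apps, None)
    return first, read_state()
```

A test would then assert that the first snapshot contains the expected field values and that the two snapshots are equal, which pins the name strings, field coverage, and idempotency in one place.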


3. Changelog fragment omits the AgentConfiguration.description side-effect

File: changelog.d/document-description-prompt.changed.md, line 1

The fragment documents that 0017 re-syncs CorpusActionTemplate. It doesn't mention that the migration also updates AgentConfiguration.description for the matching "Document Description Updater Agent". An operator triaging an unexpected change to agents.AgentConfiguration after the deploy won't find it in the changelog and may misattribute it or treat it as data corruption.

Suggested addition to the fragment: "…also updates the corresponding AgentConfiguration description."


4. (Minor / Altitude) Bespoke "find-by-name, diff, save" pattern repeated without a shared helper

Migrations 0016 and 0017 both implement the same boilerplate — filter by name, compare fields, conditionally save — with no shared utility. As the prompt catalog grows, this will be copy-pasted again. A lightweight helper in opencontractserver/agents/migrations/migration_helpers.py (or similar) — update_seeded_template(apps, name, **fields) — would enforce the update_fields guard, be testable in isolation, and reduce each future migration to a handful of lines. CLAUDE.md's DRY rule applies here.

Not a blocker for this PR, but worth tracking.
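
A minimal sketch of what such a helper could look like, assuming rows are looked up by a name field as in the existing migrations. The extra model_label parameter is an addition to the signature proposed above (the helper has to know which model to query), and the returned list of changed fields is likewise an illustrative choice, not something the codebase defines.

```python
def update_seeded_template(apps, model_label, name, **fields):
    """Patch `fields` on the row called `name`, saving only what changed.

    `apps` is the registry a RunPython function receives; `model_label` is
    "app_label.ModelName". Returns the list of field names written.
    """
    app_label, model_name = model_label.split(".")
    Model = apps.get_model(app_label, model_name)

    obj = Model.objects.filter(name=name).first()
    if obj is None:
        return []  # row was never seeded, so there is nothing to re-sync

    # Compare before assigning so a repeated run performs no write at all.
    changed = [f for f, value in fields.items() if getattr(obj, f) != value]
    for f in changed:
        setattr(obj, f, fields[f])
    if changed:
        obj.save(update_fields=changed)
    return changed
```

With a helper of this shape, a future prompt migration's forward function might reduce to one call per model, and the helper itself could be unit-tested once instead of per migration.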


Overall: the prompt change itself is correct and the migration is structurally sound. Items 1 and 3 are one-line fixes; item 2 is the most actionable ask.

JSv4 merged commit 673c6d8 into main Jun 29, 2026
13 checks passed