feat(form-documents): improve delivery mode and conditional page guidance

danielnaab · danielnaab · commit 2042151ee962 · 2026-05-07T05:25:43.000Z

Delivery mode: removed the overly conservative "default to static"
instruction and replaced it with content-complexity-based criteria
(narrative fields, sensitive topics, and eligibility logic →
conversational).

Conditional pages: added explicit instructions for deriving page-level
conditions from field-level conditions, with a worked example in the
schema. Conditional use improved only modestly (+6.3pp); the inference
remains hard for a prompt-only approach.
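For comparison, the derivation the prompt asks the model to perform could be sketched deterministically as below. This is only an illustration, not code from this commit: the interfaces (`FieldCondition`, `SpecRequirement`, `SpecGroup`, `LayoutPage`), both function names, and the field names other than `hasServedInMilitary` are hypothetical, and "most" is assumed to mean more than half of a group's requirements.

```typescript
// Hypothetical shapes: the real DataCollectionSpec and FormSpec types are not
// shown in this commit, so these interfaces are assumptions.
interface FieldCondition {
  field: string;
  operator: string;
  value: string;
}

interface SpecRequirement {
  fieldName: string;
  condition?: FieldCondition;
}

interface SpecGroup {
  id: string;
  requirements: SpecRequirement[];
}

interface LayoutPage {
  id: string;
  groups: string[];
  condition?: FieldCondition;
}

// Returns the condition shared by more than `minShare` of a group's
// requirements, or undefined if no single condition is that common.
function derivePageCondition(group: SpecGroup, minShare = 0.5): FieldCondition | undefined {
  const counts: Record<string, number> = {};
  let best: FieldCondition | undefined;
  let bestCount = 0;
  for (const req of group.requirements) {
    if (!req.condition) continue;
    const { field, operator, value } = req.condition;
    const key = JSON.stringify([field, operator, value]);
    const n = (counts[key] || 0) + 1;
    counts[key] = n;
    if (n > bestCount) {
      best = req.condition;
      bestCount = n;
    }
  }
  return best && bestCount / group.requirements.length > minShare ? best : undefined;
}

// True when the page has no condition, or when its gate field belongs to a
// group placed on an earlier page.
function gateIsOnEarlierPage(pages: LayoutPage[], groups: SpecGroup[], pageIndex: number): boolean {
  const page = pages[pageIndex];
  if (!page || !page.condition) return true;
  const gateField = page.condition.field;
  return pages.slice(0, pageIndex).some((earlier) =>
    earlier.groups.some((groupId) =>
      groups.some(
        (g) => g.id === groupId && g.requirements.some((r) => r.fieldName === gateField),
      ),
    ),
  );
}

// Illustrative data only; "fullName", "branch", and "dischargeType" are made up.
const served: FieldCondition = { field: "hasServedInMilitary", operator: "equals", value: "Yes" };
const background: SpecGroup = {
  id: "background",
  requirements: [{ fieldName: "fullName" }, { fieldName: "hasServedInMilitary" }],
};
const militaryDetails: SpecGroup = {
  id: "military-service-details",
  requirements: [
    { fieldName: "branch", condition: served },
    { fieldName: "dischargeType", condition: served },
  ],
};
const layoutPages: LayoutPage[] = [
  { id: "page-1", groups: ["background"] },
  { id: "page-2", groups: ["military-service-details"], condition: derivePageCondition(militaryDetails) },
];
```

A check like `gateIsOnEarlierPage(layoutPages, [background, militaryDetails], 1)` could also run as a post-validation step on model output, independent of how the layout was produced.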

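Results: overall 77.1% (+19.8pp vs baseline). Delivery mode regression
eliminated (deliveryModeChoice 56.3% → 75.0% vs the previous run).
Conditional use improved from 37.5% to 43.8%. Page sizing dropped from
81.3% to 68.8% and title clarity from 100% to 93.8% in the same run.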
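The percentages above appear to follow from the rubric scores in the experiment JSON. The sketch below is inferred from the numbers in this diff rather than from the scoring code, which is not shown: it assumes each 1-5 rubric score is rescaled to 0-1 as (score - 1) / 4 and that "overall" is the unweighted mean of the six metrics. The function names are hypothetical.

```typescript
// Assumption: a 1-5 rubric score maps linearly onto 0-1 (3 -> 0.5, 5 -> 1).
function normalizeScore(score: number): number {
  return (score - 1) / 4;
}

// Assumption: "overall" is the unweighted mean of the normalized metrics.
function overallScore(metrics: number[]): number {
  let sum = 0;
  for (const m of metrics) sum += m;
  return sum / metrics.length;
}

// pardon-application after this commit, in the order pageSizing, topicCohesion,
// logicalProgression, conditionalUse, titleClarity, deliveryModeChoice.
const pardonMetrics = [3, 4, 5, 3, 5, 4].map(normalizeScore);
console.log(overallScore(pardonMetrics)); // 0.75, matching the fixture's "overall"
```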
diff --git a/catalog/experiments/layout-quality/sonnet-hybrid-layout-v1.json b/catalog/experiments/layout-quality/sonnet-hybrid-layout-v1.json
@@ -3,97 +3,97 @@
   "implementation": "sonnet-hybrid-layout-v1",
   "specVersion": "2026-05-06",
   "status": "current",
-  "timestamp": "2026-05-06T07:49:47.601Z",
+  "timestamp": "2026-05-06T08:10:01.163Z",
   "model": "Claude Sonnet 4 (hybrid + layout)",
   "summary": {
-    "pageSizing": 0.8125,
+    "pageSizing": 0.6875,
     "topicCohesion": 0.875,
-    "logicalProgression": 0.875,
-    "conditionalUse": 0.375,
-    "titleClarity": 1,
-    "deliveryModeChoice": 0.5625,
-    "overall": 0.75
+    "logicalProgression": 0.9375,
+    "conditionalUse": 0.4375,
+    "titleClarity": 0.9375,
+    "deliveryModeChoice": 0.75,
+    "overall": 0.7708333333333333
   },
   "cases": [
     {
       "fixture": "pardon-application",
       "metrics": {
         "pageSizing": 0.5,
         "topicCohesion": 0.75,
-        "logicalProgression": 0.75,
-        "conditionalUse": 0.25,
+        "logicalProgression": 1,
+        "conditionalUse": 0.5,
         "titleClarity": 1,
-        "deliveryModeChoice": 0.5,
-        "overall": 0.625
+        "deliveryModeChoice": 0.75,
+        "overall": 0.75
       },
       "details": {
         "rawScores": {
           "pageSizing": {
             "score": 3,
-            "rationale": "Page 1 has 32 fields which is quite large for a single page, while pages like page 4 and page 7 have only 1-5 fields; the distribution is uneven though splitting page 1 further could be warranted."
+            "rationale": "Page 1 has 32 fields which is quite large and could overwhelm users, while pages like 4, 7, and 9 have only 1-2 fields; splitting the background information into identity, address/contact, and demographics would improve usability."
           },
           "topicCohesion": {
             "score": 4,
-            "rationale": "Most pages group related topics well, though page 5 combines sobriety/substance use with financial information which are somewhat distinct sensitive topics, and page 2 combines residence history with job history."
+            "rationale": "Most pages group related topics well (military, case background, certifications), though page 5 combines substance use and financial matters which are somewhat distinct sensitive topics, and page 3 mixes housing and employment."
           },
           "logicalProgression": {
-            "score": 4,
-            "rationale": "The flow from personal info → living/work history → education/military → community → health/finances → criminal history → reasons → references → certification is logical, though placing reasons for pardon after criminal history rather than before is slightly unusual."
+            "score": 5,
+            "rationale": "The flow moves naturally from identity to background history, then personal growth, sensitive matters, the actual conviction details, reasons for pardon, and finally legal certifications and references."
           },
           "conditionalUse": {
-            "score": 2,
-            "rationale": "The DataCollectionSpec has clear conditional fields (military service details conditional on serving, substance use details conditional on having struggled, previous application dates conditional on having applied before) but no page-level conditions are used anywhere."
+            "score": 3,
+            "rationale": "Military service and previous application details have conditional relevance but the form doesn't use page-level conditions to skip them for non-applicable users, and substance use history could also be conditionally shown."
           },
           "titleClarity": {
             "score": 5,
-            "rationale": "All page titles are plain-language, conversational, and clearly communicate what the user will be asked (e.g., 'Tell us about yourself', 'Why you're seeking a pardon', 'Sign and submit your application')."
+            "rationale": "All page titles are plain-language, user-friendly, and clearly communicate what the user will be asked about without jargon or bureaucratic numbering."
           },
           "deliveryModeChoice": {
-            "score": 3,
-            "rationale": "Page 5 (substance use and finances) appropriately uses conversational mode for sensitive topics, and page 3 uses hybrid for conditional military content, but page 6 (criminal history with complex narrative fields about conduct and responsibility) being static is suboptimal, and the reasons-for-pardon page could also benefit from conversational delivery."
+            "score": 4,
+            "rationale": "Conversational mode is well-chosen for sensitive topics like substance use, conviction details, and reasons for pardon; however, the certification/signatures page might be better as static since it requires precise legal acknowledgments rather than dialogue."
           }
         },
         "pageCount": 9,
-        "fieldCount": 128,
+        "fieldCount": 76,
         "groupCount": 13
       }
     },
     {
       "fixture": "i-9",
       "metrics": {
-        "pageSizing": 0.75,
+        "pageSizing": 0.5,
         "topicCohesion": 1,
         "logicalProgression": 0.75,
         "conditionalUse": 0.25,
-        "titleClarity": 1,
+        "titleClarity": 0.75,
         "deliveryModeChoice": 0.5,
-        "overall": 0.7083333333333334
+        "overall": 0.625
       },
       "details": {
         "rawScores": {
           "pageSizing": {
-            "score": 4,
-            "rationale": "Page 1 has 20 fields and page 2 has 20 fields which is on the larger side but acceptable for a government form; pages 3 and 4 have 9-12 fields which is well-sized."
+            "score": 3,
+            "rationale": "Page 1 has 20 fields and page 2 has 20 fields, which are large but manageable given they map to logical form sections; however, page 1 could benefit from being split into personal info and immigration status sub-pages."
           },
           "topicCohesion": {
             "score": 5,
-            "rationale": "Each page maps directly to one logical group from the I-9 form structure, maintaining perfect topic cohesion within each page."
+            "rationale": "Each page maps directly to a single logical group from the I-9 form structure, maintaining perfect topic cohesion within each page."
           },
           "logicalProgression": {
             "score": 4,
-            "rationale": "The flow follows the official I-9 section order logically, though placing the preparer/translator section after employer verification rather than after employee information slightly deviates from the actual form completion sequence."
+            "rationale": "The flow from employee info to employer verification to preparer to reverification follows the official I-9 section order, though placing preparer certification after employer verification is slightly odd since it relates to Section 1."
           },
           "conditionalUse": {
             "score": 2,
-            "rationale": "Pages 3 and 4 are clearly conditional (only needed if a preparer assisted or for reverification/rehire) but have no page-level conditions defined; additionally, immigration-related fields in page 1 could benefit from conditional logic based on citizenship status."
+            "rationale": "The form has clearly conditional sections (preparer/translator only applies if someone assisted, reverification only for rehires, immigration fields conditional on citizenship status) but no page-level conditions are defined."
           },
           "titleClarity": {
-            "score": 5,
-            "rationale": "Titles like 'Tell us about yourself,' 'Employer document review,' and 'Preparer or translator assistance' are plain-language, descriptive, and help users immediately understand each page's purpose."
+            "score": 4,
+            "rationale": "Titles like 'Tell us about yourself,' 'Document verification,' and 'Preparer assistance' are plain-language and descriptive, though 'Tell us about yourself' slightly undersells the citizenship attestation component."
           },
           "deliveryModeChoice": {
             "score": 3,
-            "rationale": "Page 1 appropriately uses hybrid mode given its mix of simple identity fields and complex citizenship/immigration attestation, but pages 3 and 4 could benefit from conversational mode since they involve conditional logic about whether they apply at all."
+            "rationale": "Using conversational mode for the employee section makes sense given conditional immigration fields, but the employer verification section with complex document lists would benefit more from conversational/hybrid guidance, while the simple preparer fields being static is appropriate."
           }
         },
         "pageCount": 4,
@@ -104,44 +104,44 @@
     {
       "fixture": "w-9",
       "metrics": {
-        "pageSizing": 1,
+        "pageSizing": 0.75,
         "topicCohesion": 0.75,
         "logicalProgression": 1,
         "conditionalUse": 0.5,
         "titleClarity": 1,
-        "deliveryModeChoice": 0.5,
+        "deliveryModeChoice": 0.75,
         "overall": 0.7916666666666666
       },
       "details": {
         "rawScores": {
           "pageSizing": {
-            "score": 5,
-            "rationale": "19 fields across 4 pages yields an average of ~5 fields per page, which is well-balanced for this form's complexity."
+            "score": 4,
+            "rationale": "19 fields spread across 4 pages is reasonable; page 1 has 6 fields and page 4 has 6 fields which are appropriately sized, though page 3 with only 2 fields is slightly thin."
           },
           "topicCohesion": {
             "score": 4,
-            "rationale": "Most pages have clear topical focus, though combining entity-information and exemptions on page 1 mixes two distinct groups—albeit related enough to work together."
+            "rationale": "Most pages are cohesive, though page 1 mixes entity identification with address information (two distinct groups), and account numbers are oddly placed with address rather than with taxpayer identification."
           },
           "logicalProgression": {
             "score": 5,
-            "rationale": "The flow from entity identification to address to TIN to certification/signature follows the natural W-9 order and moves from easier to more sensitive information."
+            "rationale": "The flow from identity → tax classification → TIN → certification/signature follows the natural W-9 order and moves from easy to sensitive information logically."
           },
           "conditionalUse": {
             "score": 3,
-            "rationale": "The LLC tax classification field is conditional on selecting LLC, and the foreign partners indicator applies only to certain entity types, yet no page-level conditions are used to handle these cases."
+            "rationale": "The LLC tax classification field is conditional on selecting LLC, and the foreign partners indicator is situational, but no page-level conditions are used to handle these cases."
           },
           "titleClarity": {
             "score": 5,
-            "rationale": "All page titles are plain-language, action-oriented, and clearly communicate what information the user will provide on each page."
+            "rationale": "Titles like 'Tell us about yourself,' 'Tax classification and exemptions,' 'Taxpayer identification,' and 'Certification and signature' are clear, plain-language, and descriptive."
           },
           "deliveryModeChoice": {
-            "score": 3,
-            "rationale": "Page 1 uses hybrid which is reasonable given the conditional LLC classification, but pages 3 and 4 could benefit from conversational mode since TIN entry requires choosing between SSN/EIN and certification involves understanding legal statements."
+            "score": 4,
+            "rationale": "Using conversational mode for the sensitive TIN page and certification is smart, and hybrid for the conditional tax classification section is appropriate, though the static mode for page 1 is also fitting for straightforward fields."
           }
         },
         "pageCount": 4,
         "fieldCount": 19,
-        "groupCount": 5
+        "groupCount": 6
       }
     },
     {
@@ -152,34 +152,34 @@
         "logicalProgression": 1,
         "conditionalUse": 0.5,
         "titleClarity": 1,
-        "deliveryModeChoice": 0.75,
-        "overall": 0.875
+        "deliveryModeChoice": 1,
+        "overall": 0.9166666666666666
       },
       "details": {
         "rawScores": {
           "pageSizing": {
             "score": 5,
-            "rationale": "Each page has 6-9 fields, which is appropriate for a 43-field form spread across 6 pages—neither too dense nor over-paginated."
+            "rationale": "Each page has 6-9 fields, which is well-balanced for a 43-field form spread across 6 pages, avoiding both overcrowding and over-pagination."
           },
           "topicCohesion": {
             "score": 5,
-            "rationale": "Each page maps directly to one cohesive data group (personal info, household, income, assets, expenses, signature), maintaining clear topical focus."
+            "rationale": "Each page maps directly to one cohesive data group with clearly related fields (personal info, household, income, assets, expenses, signature)."
           },
           "logicalProgression": {
             "score": 5,
-            "rationale": "The flow moves naturally from identity → household → income → assets → expenses → certification/signature, following standard benefits application logic and building from simple to sensitive."
+            "rationale": "The flow moves naturally from identity → household → income → assets → expenses → review/signature, following standard benefits application logic and progressing from easy to more complex/sensitive."
           },
           "conditionalUse": {
             "score": 3,
-            "rationale": "The DataCollectionSpec has optional household members and conditional-like fields (e.g., self-employment, authorized representative) that could benefit from page-level conditions, but none are used."
+            "rationale": "The household composition and self-employment fields could benefit from page-level conditions (e.g., only showing household members if applicable), but no conditional logic is used despite optional field groups."
           },
           "titleClarity": {
             "score": 5,
-            "rationale": "All titles use plain, friendly language ('Tell us about yourself', 'Your monthly expenses') that clearly communicates what the user will be asked on each page."
+            "rationale": "All titles are plain-language, user-friendly, and clearly describe what the user will be asked on each page (e.g., 'Your income sources', 'Your monthly expenses')."
           },
           "deliveryModeChoice": {
-            "score": 4,
-            "rationale": "Income and expenses are appropriately set to hybrid given their conditional complexity, and the signature page uses conversational mode for guidance, though assets could also benefit from hybrid mode given vehicle/property conditionality."
+            "score": 5,
+            "rationale": "Static mode is appropriate for straightforward factual fields (personal info, household), hybrid for moderately complex financial sections (income, assets, expenses), and conversational for the review/expedited screening questions that benefit from guided interaction."
           }
         },
         "pageCount": 6,
diff --git a/src/services/form-documents/layout-prompt.ts b/src/services/form-documents/layout-prompt.ts
@@ -47,18 +47,18 @@ Apply these civic tech best practices when assigning groups to pages:
 2. **Front-load easy questions** — place simple, low-effort fields (name, contact info) on early pages to build momentum before complex sections.
 3. **Group for recognition** — related fields together reduce cognitive load. Users should recognize why fields appear on the same page.
 4. **Use plain-language titles** — page titles should describe what the user will do, not internal jargon (e.g., "Tell us about yourself" not "Personal Information Section A").
-5. **Conditional pages** — if a group has a condition, place it on its own page so it can be skipped entirely without confusing the user.
+5. **Conditional pages** — scan the DataCollectionSpec for groups where most or all requirements share the same condition (e.g., multiple fields with "condition": {"field": "hasServedInMilitary", "operator": "equals", "value": "Yes"}). When you find such a group, place it on its own page and add that same condition to the page. The "gate" question (the field referenced in the condition) must appear on a PRIOR page so the system knows whether to show or skip the conditional page. Example: if fields about military details all require hasServedInMilitary == "Yes", put those fields' group on a page with "condition": {"field": "hasServedInMilitary", "operator": "equals", "value": "Yes"}, and ensure the hasServedInMilitary field itself is on an earlier page.
 6. **Don't over-paginate** — avoid single-field pages unless justified by sensitivity or conditionality. Two closely related groups can share a page.
 
 ## deliveryMode assignment
 
 Assign a deliveryMode to each page based on its content:
 
-- **static** — straightforward fields with clear labels (name, date, address). Most pages should be static.
-- **conversational** — sections with many conditional fields, complex eligibility logic, or questions that benefit from guided explanation.
-- **hybrid** — moderately complex sections where some fields are straightforward but others may need clarification.
+- **static** — straightforward factual fields with clear labels where the user knows the answer immediately (name, date of birth, mailing address).
+- **conversational** — sections involving: narrative free-text fields, sensitive topics (criminal history, substance use, legal attestations), complex eligibility logic, or fields where users commonly need guidance to understand what's being asked.
+- **hybrid** — pages mixing simple factual fields with one or two that may need clarification (e.g., an address page that also asks about mailing preferences).
 
-Default to "static" unless the page content clearly warrants conversational or hybrid treatment.
+Choose based on the content's complexity, not just field count. A page with 3 narrative fields about criminal conduct is more complex than a page with 8 address fields.
 
 ## FormSpec JSON schema
 
@@ -70,11 +70,19 @@ Return ONLY valid JSON (no markdown fences, no explanation) matching this schema
   "title": "string — a user-friendly form title",
   "pages": [
     {
-      "id": "page-<n>",
+      "id": "page-1",
       "title": "string — plain-language page title",
       "description": "string (optional) — brief guidance for the user",
-      "groups": ["group-id-1", "group-id-2"],
-      "deliveryMode": "static | conversational | hybrid"
+      "groups": ["group-id-1"],
+      "deliveryMode": "static"
+    },
+    {
+      "id": "page-2",
+      "title": "Your military service",
+      "description": "Tell us about your time in the armed forces.",
+      "groups": ["military-service-details"],
+      "condition": { "field": "hasServedInMilitary", "operator": "equals", "value": "Yes" },
+      "deliveryMode": "hybrid"
     }
   ],
   "createdAt": "${new Date().toISOString()}",
@@ -83,6 +91,8 @@ Return ONLY valid JSON (no markdown fences, no explanation) matching this schema
 
 Each page's "groups" array references group IDs from the DataCollectionSpec. Every group must appear in exactly one page.
 
+The "condition" property is OPTIONAL — use it when a page's content only applies to users who answered a specific way on a previous page. The "field" must reference a fieldName from the DataCollectionSpec. When a condition is present, the page is skipped if the condition is not met.
+
 ## DataCollectionSpec
 
 ${JSON.stringify(spec, null, 2)}