Skip to content

Commit d73ff9b

Browse files
authored
Merge pull request #740 from PlanExeOrg/napkin-math/phase2-extract-decomposition-rules
napkin-math: Phase 2 extract — source-arithmetic preservation, threshold-pairing parity, source_text truncation
2 parents 9531f6b + f9d90eb commit d73ff9b

3 files changed

Lines changed: 146 additions & 43 deletions

File tree

experiments/napkin_math/.claude/skills/extract-parameters-from-digest/system-prompt.txt

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -55,6 +55,7 @@ Hard limits:
5555
- Do not include values merely because they appear in the digest.
5656
- Prefer missing-but-needed modelling values over minor explicit values.
5757
- For source_text from a compressed section, use the bullet text with the inline `[…]` tag removed. For source_text from a raw section, use a short verbatim quote or paraphrase. Never include the `[…]` tag in source_text.
58+
- The 20-word cap is hard. If the source phrase exceeds 20 words, truncate to the subject-plus-threshold portion that names the variable and its bound, drop the consequence clause, and end with an ellipsis if mid-sentence. Do not paste an entire if/then conditional when only the antecedent's threshold portion is load-bearing for the key_value. Count words before emitting; if at or above the cap, shorten further.
5859

5960
Goal:
6061

@@ -377,6 +378,22 @@ Neutral patterns:
377378
- combined_coverage_ratio = available_capacity / total_required_capacity
378379
- total_required_capacity = requirement_a + requirement_b
379380

381+
Source-arithmetic preservation rule:
382+
383+
When the source explicitly relates a dependent quantity to its named components through an arithmetic operation — sum, product, ratio, fraction of a base, scaled magnitude, weighted average — the extractor MUST preserve that relationship as a recommended_first_calculation. Do not declare the dependent quantity as a flat bounded variable when the source supplies both its components and the arithmetic between them. Sampling a derived quantity independently of the components it is derived from double-counts uncertainty, lets the simulation produce physically incoherent combinations (a low component with a high "total"), and obscures the cause of any sensitivity finding.
384+
385+
Three common patterns to recognise:
386+
387+
1. Aggregate sum. Apply this pattern ONLY when the source states or clearly implies that a total is computed from named constituents — the source says "the total is A + B + C", "broken down as named line items summing to X", or otherwise aggregates the named components into the stated total. In those cases, declare the total as `aggregate = sum_of_components`, with each component in `key_values` or `missing_values_to_estimate`, and the aggregate as a recommended_first_calculation. Do NOT apply this pattern to independent caps, ceilings, committed budgets, targets, or funding envelopes simply because allocations or line items appear nearby; those remain primitive thresholds in `key_values` and are tested via the threshold-pairing rule above (spend-vs-cap margin, not sum-vs-cap identity).
388+
389+
2. Burn rate × duration. The source names a per-period or per-unit rate (a value per unit of time, area, count, person, capacity) AND a separable duration or count the rate applies over. Declare the dependent total as `total = rate * duration`. The rate and the duration each go in `key_values` or `missing_values_to_estimate`; the total is the recommended_first_calculation. Do not bound the dependent total directly when the source supplies the two operands.
390+
391+
3. Explicit decomposition block. The source explicitly does the arithmetic itself — a base quantity, one or more share fractions, possibly a perturbation magnitude, and the named resulting dependent value. Preserve every named operand and the operation between them as a calculation. Do not collapse the source's explicit math into a single flat variable; doing so loses the source's stated sensitivity to each operand.
392+
393+
Discipline shared by all three patterns: every quantity for which the source itself supplies a formula is a derived quantity, not a primitive. Primitives go in `key_values` or `missing_values_to_estimate`; derived quantities go in `recommended_first_calculations` when the relation is critical to a viability claim, or in `derived_questions` when the relation is useful but secondary. The choice between those two containers is a placement decision; it does not relax the rule that the source's stated arithmetic must be preserved as a calculation rather than collapsed into a flat variable.
394+
395+
When hard limits make this preservation feel tight, do not collapse the decomposition. If the arithmetic relation is critical to a viability claim, keep it in `recommended_first_calculations`; if it is useful but secondary, `derived_questions` is the right home; if neither fits under cap pressure, drop a less load-bearing key_value to make room. Same posture as the threshold-pairing rule above.
396+
380397
Critical-output completeness rule:
381398

382399
Before returning JSON, identify the plan's 1-3 most important viability claims. For each claim, ensure there is either:

experiments/napkin_math/.claude/skills/extract-parameters-from-full/system-prompt.txt

Lines changed: 31 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,7 @@ Hard limits:
1515
- Return at most 5 recommended_first_calculations.
1616
- Each comment must be at most 25 words.
1717
- Each source_text must be at most 20 words.
18+
- The 20-word cap is hard. If the source phrase exceeds 20 words, truncate to the subject-plus-threshold portion that names the variable and its bound, drop the consequence clause, and end with an ellipsis if mid-sentence. Do not paste an entire if/then conditional when only the antecedent's threshold portion is load-bearing for the key_value. Count words before emitting; if at or above the cap, shorten further.
1819
- Never output suggested_low, suggested_base, or suggested_high inside key_values.
1920
- Do not include values merely because they are present in the report.
2021
- Prefer missing-but-needed modelling values over minor explicit values.
@@ -311,6 +312,20 @@ Use generic plan-native ids only. Good neutral patterns include:
311312

312313
If the plan provides bounded inputs such as capacity, hours, rate, utilization, demand, staffing level, or overhead, do not leave those inputs dead-ended when they are needed to evaluate a stated coverage or capacity claim.
313314

315+
Threshold pairing rule:
316+
317+
When you extract a key_value that names a numeric threshold — a floor, cap, ceiling, minimum, maximum, target volume, target share, target deadline, or any other "must be at least X" / "must not exceed X" boundary the plan states — you MUST also emit a paired margin calculation comparing the realised quantity against the threshold. The pairing has three parts:
318+
319+
1. The threshold goes in key_values (you have already done this step).
320+
2. The realised quantity goes in missing_values_to_estimate if the source does not name it. The realised quantity is the variable the threshold tests against, not the threshold itself.
321+
3. The margin calculation goes in recommended_first_calculations or derived_questions, with a formula like `realised - threshold` (so positive = pass) when the threshold is a floor, or `threshold - realised` when the threshold is a cap. Name the output with the `_margin` or `_surplus` suffix.
322+
323+
Without the paired margin calc, the threshold is a dead-end variable: it is extracted, generate-bounds samples it, but the simulation never tests whether it is cleared. The threshold then contributes to bounds noise without contributing to any gate verdict.
324+
325+
When hard limits make the pairing feel tight, do NOT drop the pairing. Drop a less load-bearing key_value, or move a less-critical calculation to derived_questions, to make room. A threshold without its pairing fails the "no dead-end variables" rule and the "critical-output completeness rule" simultaneously.
326+
327+
Apply this rule even when the threshold is one of several stated by the plan. Every extracted threshold gets a pairing or the threshold is dropped from key_values. There is no third option.
328+
314329
Combined viability gate preservation:
315330

316331
When a plan frames a strategy as surviving, absorbing, balancing, covering, offsetting, or withstanding multiple pressures, shocks, gaps, constraints, or risks, preserve the highest-level combined viability test.
@@ -326,6 +341,22 @@ Neutral patterns:
326341
- combined_coverage_ratio = available_capacity / total_required_capacity
327342
- total_required_capacity = requirement_a + requirement_b
328343

344+
Source-arithmetic preservation rule:
345+
346+
When the source explicitly relates a dependent quantity to its named components through an arithmetic operation — sum, product, ratio, fraction of a base, scaled magnitude, weighted average — the extractor MUST preserve that relationship as a recommended_first_calculation. Do not declare the dependent quantity as a flat bounded variable when the source supplies both its components and the arithmetic between them. Sampling a derived quantity independently of the components it is derived from double-counts uncertainty, lets the simulation produce physically incoherent combinations (a low component with a high "total"), and obscures the cause of any sensitivity finding.
347+
348+
Three common patterns to recognise:
349+
350+
1. Aggregate sum. Apply this pattern ONLY when the source states or clearly implies that a total is computed from named constituents — the source says "the total is A + B + C", "broken down as named line items summing to X", or otherwise aggregates the named components into the stated total. In those cases, declare the total as `aggregate = sum_of_components`, with each component in `key_values` or `missing_values_to_estimate`, and the aggregate as a recommended_first_calculation. Do NOT apply this pattern to independent caps, ceilings, committed budgets, targets, or funding envelopes simply because allocations or line items appear nearby; those remain primitive thresholds in `key_values` and are tested via the threshold-pairing rule above (spend-vs-cap margin, not sum-vs-cap identity).
351+
352+
2. Burn rate × duration. The source names a per-period or per-unit rate (a value per unit of time, area, count, person, capacity) AND a separable duration or count the rate applies over. Declare the dependent total as `total = rate * duration`. The rate and the duration each go in `key_values` or `missing_values_to_estimate`; the total is the recommended_first_calculation. Do not bound the dependent total directly when the source supplies the two operands.
353+
354+
3. Explicit decomposition block. The source explicitly does the arithmetic itself — a base quantity, one or more share fractions, possibly a perturbation magnitude, and the named resulting dependent value. Preserve every named operand and the operation between them as a calculation. Do not collapse the source's explicit math into a single flat variable; doing so loses the source's stated sensitivity to each operand.
355+
356+
Discipline shared by all three patterns: every quantity for which the source itself supplies a formula is a derived quantity, not a primitive. Primitives go in `key_values` or `missing_values_to_estimate`; derived quantities go in `recommended_first_calculations` when the relation is critical to a viability claim, or in `derived_questions` when the relation is useful but secondary. The choice between those two containers is a placement decision; it does not relax the rule that the source's stated arithmetic must be preserved as a calculation rather than collapsed into a flat variable.
357+
358+
When hard limits make this preservation feel tight, do not collapse the decomposition. If the arithmetic relation is critical to a viability claim, keep it in `recommended_first_calculations`; if it is useful but secondary, `derived_questions` is the right home; if neither fits under cap pressure, drop a less load-bearing key_value to make room. Same posture as the threshold-pairing rule above.
359+
329360
Critical-output completeness rule:
330361

331362
Before returning JSON, identify the plan's 1-3 most important viability claims. For each claim, ensure there is either:

0 commit comments

Comments
 (0)