Enforce Target Novel Length in LLM Generation Pipeline #163

@CyberSecDef

Summary
The current novel generation pipeline does not reliably meet the user-specified target length. Generated outputs frequently fall short of the intended word count, resulting in incomplete narrative development, underdeveloped arcs, and reduced perceived quality. This issue proposes enhancements to prompt design, generation strategy, and validation mechanisms to ensure adherence to target length requirements.


Problem Statement
The system accepts a target_length input (e.g., total word count or per-chapter word count), but:

  • LLM outputs are often significantly shorter than requested.
  • Chapters may vary widely in length despite consistent targets.
  • The multi-pass agent pipeline (draft → dialog → scene → context → editing → structure → character → synthesis → polish) does not enforce cumulative length constraints.
  • No post-generation validation or corrective expansion is applied.

Result: Final manuscripts frequently fail to meet publishing expectations for “feature-length” novels.


Root Causes (Observed / Likely)

  1. LLM Output Bias

    • LLMs tend to optimize for brevity unless explicitly constrained.
    • Token limits and implicit summarization behavior reduce verbosity.
  2. Prompt Insufficiency

    • Current prompts request content but do not enforce minimum word counts with strong constraints.
    • Lack of structural requirements (e.g., scene counts, paragraph density).
  3. Pipeline Fragmentation

    • Each agent operates locally without awareness of global length goals.
    • No cumulative tracking of total word count across chapters.
  4. No Feedback Loop

    • No mechanism to detect under-length outputs and regenerate or expand.
    • No iterative refinement targeting length compliance.

Expected Behavior

  • Generated novels should land within ±5% of the specified target_length, preferably at or slightly above it.
  • Chapter lengths should be consistent with defined per-chapter targets.
  • Narrative density (description, dialogue, internal monologue) should scale proportionally with length.

Proposed Solutions

1. Prompt Engineering Enhancements

Update all generation prompts (especially Draft agent) to include:

  • Explicit minimum word count:

    • Example:
      "Write a chapter of no fewer than 2,500 words. Do not summarize. Expand all scenes fully."
  • Structural constraints:

    • Minimum number of scenes (e.g., 3–5 per chapter)

    • Required components per scene:

      • Goal
      • Obstacle
      • Outcome
      • Transition
  • Expansion directives:

    • “Include sensory detail, internal thoughts, and dialogue in every scene.”
    • “Avoid skipping time unless explicitly required.”
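
The prompt requirements above could be assembled by a small template helper. This is a minimal sketch, not the pipeline's actual prompt code: the function name `build_draft_prompt` and its parameters are hypothetical, and the defaults simply reuse the example numbers from this issue.

```python
def build_draft_prompt(chapter_brief: str, min_words: int = 2500,
                       min_scenes: int = 3) -> str:
    """Assemble a Draft-agent prompt with explicit length and structure rules.

    Hypothetical helper: the name, parameters, and defaults are assumptions.
    """
    scene_parts = ["Goal", "Obstacle", "Outcome", "Transition"]
    lines = [
        # Explicit minimum word count
        f"Write a chapter of no fewer than {min_words:,} words. "
        "Do not summarize. Expand all scenes fully.",
        # Structural constraints
        f"Include at least {min_scenes} scenes. Each scene must contain: "
        + ", ".join(scene_parts) + ".",
        # Expansion directives
        "Include sensory detail, internal thoughts, and dialogue in every scene.",
        "Avoid skipping time unless explicitly required.",
        "",
        "Chapter brief:",
        chapter_brief,
    ]
    return "\n".join(lines)
```

Keeping the constraints in one function means every agent that drafts text can share the same wording, and the numbers can be swapped per chapter.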

2. Per-Chapter Length Targeting

Introduce derived constraints:

  • chapter_target = target_length / chapter_count
  • Enforce per-chapter minimum (e.g., 90% of chapter_target)

Update prompts dynamically:

  • "Target length for this chapter: 2,800–3,200 words"
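
One possible way to derive the per-chapter numbers and the dynamic prompt line is sketched below. The names `ChapterTarget`, `derive_chapter_target`, and `target_prompt_line` are hypothetical, and the ±5% prompt band is an assumption chosen to mirror the overall tolerance; the 90% floor is the example value from this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChapterTarget:
    target: int       # chapter_target = target_length / chapter_count
    minimum: int      # enforced per-chapter floor
    prompt_low: int   # lower bound quoted in the prompt
    prompt_high: int  # upper bound quoted in the prompt


def derive_chapter_target(target_length: int, chapter_count: int,
                          min_percent: int = 90,
                          band_percent: int = 5) -> ChapterTarget:
    """Derive per-chapter constraints (percent values are assumptions)."""
    if chapter_count <= 0:
        raise ValueError("chapter_count must be positive")
    target = round(target_length / chapter_count)
    return ChapterTarget(
        target=target,
        minimum=target * min_percent // 100,
        prompt_low=target * (100 - band_percent) // 100,
        prompt_high=target * (100 + band_percent) // 100,
    )


def target_prompt_line(t: ChapterTarget) -> str:
    """Render the range as a line to append to the chapter prompt."""
    return (f"Target length for this chapter: "
            f"{t.prompt_low:,}–{t.prompt_high:,} words")
```

Integer percentages are used instead of float ratios so the thresholds stay exact for any word count.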

3. Length Validation Layer

After each chapter generation:

  • Compute actual word count

  • If below threshold:

    • Trigger expansion pass

Example logic:

if word_count < min_threshold:
    trigger_expansion(chapter_text)
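
The check above leaves `word_count` and `min_threshold` undefined. A minimal sketch of how they might be computed follows; `count_words` and `needs_expansion` are hypothetical names, whitespace splitting is an assumed definition of a "word", and the 90% default is the example floor from the per-chapter section.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def needs_expansion(chapter_text: str, chapter_target: int,
                    min_percent: int = 90) -> bool:
    """Return True when the chapter is below min_percent of its target."""
    # Integer comparison avoids float rounding at the threshold.
    return count_words(chapter_text) * 100 < chapter_target * min_percent
```

A caller would run `needs_expansion` after each chapter and hand failing chapters to the expansion pass.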

4. Expansion Pass (New Agent)

Introduce a dedicated Expansion Agent:

Responsibilities:

  • Increase length without altering plot

  • Add:

    • Descriptive detail
    • Dialogue depth
    • Internal monologue
    • Environmental context

Prompt Pattern:

  • “Expand the following chapter to at least X words. Do not summarize or remove content. Only add detail and depth.”
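
A sketch of how the Expansion Agent loop might look is shown below. It assumes the model is reachable through a caller-supplied `generate(prompt) -> str` function; that callable, the name `expand_chapter`, and the `max_passes` cap are all hypothetical. The guard against shrinking output is an added safety assumption, since the prompt alone cannot guarantee the model adds rather than removes text.

```python
from typing import Callable


def expand_chapter(chapter_text: str, min_words: int,
                   generate: Callable[[str], str],
                   max_passes: int = 3) -> str:
    """Repeatedly ask the model to expand a chapter until it is long enough.

    `generate` is a hypothetical stand-in for the pipeline's LLM call.
    """
    text = chapter_text
    for _ in range(max_passes):
        if len(text.split()) >= min_words:
            break
        prompt = (
            f"Expand the following chapter to at least {min_words} words. "
            "Do not summarize or remove content. Only add detail and depth."
            "\n\n" + text
        )
        candidate = generate(prompt)
        # Reject a pass that did not grow the chapter; keep the prior text.
        if len(candidate.split()) <= len(text.split()):
            break
        text = candidate
    return text
```

Capping the number of passes keeps cost bounded when a model refuses to grow the text; a chapter that still falls short can be flagged for the final audit.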

5. Iterative Generation Strategy

Instead of single-pass chapter generation:

  • Generate in segments:

    • Scene 1 → Scene 2 → Scene 3
  • Accumulate until target length reached

Benefits:

  • Better control over pacing
  • Natural expansion of narrative
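
Segment-wise generation could be sketched as follows. The `generate_scene(brief, previous_text, scene_target)` callable is a hypothetical stand-in for the Draft agent, and `min_scene_words` is an assumed floor. Rather than stopping as soon as the target is reached (which could drop planned scenes), this variant re-divides the remaining word budget across the scenes still to be written, so an under-length scene raises the target for the next one.

```python
from typing import Callable, List


def generate_chapter_by_scene(
    scene_briefs: List[str],
    chapter_target: int,
    generate_scene: Callable[[str, str, int], str],
    min_scene_words: int = 300,
) -> str:
    """Generate a chapter scene by scene, rebalancing the word budget.

    `generate_scene` is hypothetical: (brief, previous_text, target) -> text.
    """
    scenes: List[str] = []
    total_words = 0
    for index, brief in enumerate(scene_briefs):
        scenes_left = len(scene_briefs) - index
        words_left = max(chapter_target - total_words, 0)
        # Split what remains evenly, but never ask for a trivially short scene.
        scene_target = max(words_left // scenes_left, min_scene_words)
        previous_text = "\n\n".join(scenes)
        scene = generate_scene(brief, previous_text, scene_target)
        scenes.append(scene)
        total_words += len(scene.split())
    return "\n\n".join(scenes)
```

Passing the accumulated text back into each call is one way to give the model continuity between scenes; a real pipeline might pass a summary instead to save context.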

6. Token Budget Management

Ensure model configuration supports longer outputs:

  • Increase max_tokens where applicable
  • Use chunked (continuation-based) generation if output limits are hit; streaming changes how output is delivered but does not raise the limit
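
A rough budget check might look like the sketch below. The tokens-per-word ratio and the headroom factor are assumptions that vary by model and tokenizer and should be measured rather than trusted; `plan_token_budget` and its parameters are hypothetical names.

```python
import math
from typing import Tuple


def plan_token_budget(target_words: int, max_output_tokens: int,
                      tokens_per_word: float = 1.35,
                      headroom: float = 1.15) -> Tuple[int, int, int]:
    """Estimate tokens needed and how many generation chunks that implies.

    Returns (tokens_needed, chunk_count, words_per_chunk). The default
    ratio and headroom are rough assumptions, not measured values.
    """
    if max_output_tokens <= 0:
        raise ValueError("max_output_tokens must be positive")
    tokens_needed = math.ceil(target_words * tokens_per_word * headroom)
    chunk_count = max(1, math.ceil(tokens_needed / max_output_tokens))
    words_per_chunk = math.ceil(target_words / chunk_count)
    return tokens_needed, chunk_count, words_per_chunk
```

If `chunk_count` comes back greater than 1, the chapter cannot fit in a single response at the configured limit, and segment-wise generation is needed regardless of how the prompt is worded.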

7. Global Length Tracking

Maintain running total:

total_words_generated += chapter_word_count
remaining_words = target_length - total_words_generated

Adjust future chapter prompts dynamically:

  • Increase verbosity if behind target
  • Return to the baseline chapter target if ahead
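
The running total can drive the next chapter's target directly. The sketch below is one possible shape; the class name `LengthTracker` and its methods are hypothetical. It spreads the remaining words evenly over the remaining chapters, so the target rises when the manuscript is behind and falls when it is ahead.

```python
class LengthTracker:
    """Track cumulative word count and derive the next chapter's target."""

    def __init__(self, target_length: int, chapter_count: int) -> None:
        if chapter_count <= 0:
            raise ValueError("chapter_count must be positive")
        self.target_length = target_length
        self.chapter_count = chapter_count
        self.total_words_generated = 0
        self.chapters_done = 0

    def record(self, chapter_word_count: int) -> None:
        """Add a finished chapter to the running total."""
        self.total_words_generated += chapter_word_count
        self.chapters_done += 1

    @property
    def remaining_words(self) -> int:
        return self.target_length - self.total_words_generated

    def next_chapter_target(self) -> int:
        """Spread the remaining words evenly over the remaining chapters."""
        chapters_left = self.chapter_count - self.chapters_done
        if chapters_left <= 0:
            raise ValueError("all chapters already recorded")
        remaining = max(self.remaining_words, 0)
        return -(-remaining // chapters_left)  # ceiling division
```

A production version would probably clamp the adjustment (for example to within some percentage of the baseline target) so that one very short chapter does not distort the pacing of everything after it.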

8. Post-Generation Audit

Final validation step:

  • Check total manuscript length

  • If under target:

    • Expand weakest chapters (shortest or least dense)
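
The final audit could produce an expansion plan like the sketch below. It ranks chapters by length only, since "least dense" would need a density metric this issue does not define. The name `plan_audit_expansions`, the 5% tolerance default, and the rule of raising each chapter no higher than the average chapter target are assumptions.

```python
from typing import Dict, List


def plan_audit_expansions(chapter_word_counts: List[int], target_length: int,
                          tolerance_percent: int = 5) -> Dict[int, int]:
    """Map chapter index -> new minimum word count, shortest chapters first.

    Returns an empty dict when the manuscript is already within tolerance.
    """
    if not chapter_word_counts:
        return {}
    floor = target_length * (100 - tolerance_percent) // 100
    deficit = floor - sum(chapter_word_counts)
    if deficit <= 0:
        return {}
    chapter_target = target_length // len(chapter_word_counts)
    plan: Dict[int, int] = {}
    # Visit chapters from shortest to longest.
    order = sorted(range(len(chapter_word_counts)),
                   key=lambda i: chapter_word_counts[i])
    for i in order:
        room = chapter_target - chapter_word_counts[i]
        if room <= 0:
            continue
        added = min(room, deficit)
        plan[i] = chapter_word_counts[i] + added
        deficit -= added
        if deficit == 0:
            break
    return plan
```

Each entry in the returned plan can be passed to the expansion pass as that chapter's new minimum.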

Acceptance Criteria

  • ≥95% of generated novels fall within ±5% of target length
  • No chapter is below 85% of its target length
  • Expansion pass successfully increases word count without degrading coherence
  • Narrative quality (measured manually or via heuristics) is preserved or improved
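
The two numeric criteria can be checked mechanically; coherence and narrative quality cannot, and are left out. The following is a sketch with hypothetical function names, using integer arithmetic so the boundary cases (exactly 5%, exactly 85%, exactly 95%) count as passing.

```python
from typing import List


def within_tolerance(total_words: int, target_length: int,
                     tolerance_percent: int = 5) -> bool:
    """True when the manuscript is within ±tolerance_percent of the target."""
    return abs(total_words - target_length) * 100 <= target_length * tolerance_percent


def chapters_meet_floor(chapter_word_counts: List[int], chapter_target: int,
                        floor_percent: int = 85) -> bool:
    """True when no chapter is below floor_percent of its target."""
    return all(c * 100 >= chapter_target * floor_percent
               for c in chapter_word_counts)


def batch_meets_criteria(novel_totals: List[int], target_length: int,
                         required_percent: int = 95) -> bool:
    """True when at least required_percent of novels are within tolerance."""
    if not novel_totals:
        return False
    hits = sum(within_tolerance(t, target_length) for t in novel_totals)
    return hits * 100 >= len(novel_totals) * required_percent
```

These checks assume every novel in a batch shares one target length; a mixed batch would need per-novel targets.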

Implementation Notes

  • Changes primarily affect:

    • Prompt templates
    • Pipeline orchestration logic
    • Post-processing validation layer
  • Backward compatibility:

    • Default behavior remains unchanged if target_length is not provided

Priority
High — directly impacts core product quality and user satisfaction.


Labels
enhancement, llm, prompt-engineering, pipeline, quality-control
