CyberSecDef
diff --git a/Novel_Processing_Instructions.md b/Novel_Processing_Instructions.md
Lines changed: 50 additions & 14 deletions
diff --git a/novelforge/agents/chapter/__init__.py b/novelforge/agents/chapter/__init__.py
Lines changed: 3 additions & 0 deletions
diff --git a/novelforge/agents/chapter/_helpers.py b/novelforge/agents/chapter/_helpers.py
Lines changed: 157 additions & 0 deletions
@@ -2,12 +2,13 @@
 
 ## Generation
 
-disregard all previous content in this conversation. please generate a new 1800  character story premise for a new popular <GENRE> fictional novel.  use no more than 1800 characters.  
+disregard all previous content in this conversation. please generate a new 1800  character story premise for a new popular scifi fictional novel.  use no more than 1800 characters.  
 
-<SUBJECT>
+A man appears in a small city claiming to be from a distant planet and willingly submits himself to a series of structured interviews with government officials, showing a calm certainty that challenges their assumptions. He demonstrates unusual knowledge and perception, but more notably, he takes clear enjoyment in the process—engaging deeply with each interviewer, asking his own questions, and forming meaningful, often transformative conversations with the people around him. As the officials attempt to determine whether he is delusional or something else entirely, his presence begins to influence their perspectives on reality, identity, and purpose, culminating in an unresolved departure that leaves his true nature ambiguous.
 
-it should concentrate on character development, <TOPIC>, <TOPIC>.  
-it should be action packed.  
+
+it should concentrate on character development, communication, the puzzle of whether or not he's really an alien or just a confused man.  
+it should be calm and cerebral.  
 
 Don’t use any names in the premise...just describe the characters and their roles
 
@@ -22,31 +23,66 @@ Don’t use any names in the premise...just describe the characters and their ro
 
 - Read the file "<NOVEL_PATH_MD>" and add to the context of this thread.  This is a novel written in chapters and there are chapter delineations present throughout.
 
-- our target is to make this a 9.5 / 10 book with around 85000 total words.  we are going to work through section 1 of the editors notes.  i need you to loop through each of the items in this section.  for each item, create a plan to resolve the issue.  validate that this is the best plan.  then state what you will be doing and execute your plan.  once you have executed, update the item's status in the editor's notes markdown file.  Check if later issues in the editors notes are also resolved with your actions and update accordingly.  Once you are done with the items updates and the editors notes, move on to the next item.  do this for all items in the section.  
-
-- we are now going to start adding sections to the editors notes markdown document of things that should be corrected in the next phase of edits.  make sure you have an up to date context of the novel in its current form.  our target is to make this a 9.5 / 10 book with around 85000 total words.  in this new section , document new items you feel should be executed to length and strengthen the novel.  We will have some pointed prompts following this to add targeted updates.
+- we are now going to start adding sections to the editors notes markdown document of things that should be corrected in the next phase of edits.  make sure you have an up to date context of the novel in its current form.  our target is to make this a 9.5 / 10 book with at least 85000 total words.  in this new section, document new items you feel should be executed to lengthen and strengthen the novel.  We will have some pointed prompts following this to add targeted updates.
 
 - Character Voice Differentiation
-our target is to make this a 9.5 / 10 book with around 85000 total words. Each POV character should think in a distinct internal language shaped by their background, demographics and expertise.   A 16 year old should think and talk like a 16 year old.  An old man should think and talk like an old man.  Look through the novel and find any dialog that doesnt match the speaking character.  create a plan to update these voices and add that to a new section in the editors notes markdown file.
+our target is to make this a 9.5 / 10 book with at least 85000 total words. Each POV character should think in a distinct internal language shaped by their background, demographics and expertise.   A 16 year old should think and talk like a 16 year old.  An old man should think and talk like an old man.  Look through the novel and find any dialog that doesn't match the speaking character.  create a plan to update these voices and add that to a new section in the editors notes markdown file.
 
 - Dialogue Naturalization
-our target is to make this a 9.5 / 10 book with around 85000 total words. Make sure the current dialogue isnt too clean, too functional, too information-delivery. Characters sometimes have incomplete thoughts, don't always speak in well-formed sentences, and sometimes rarely interrupt each other or themselves.  make sure the dialog in the novel reads this way.  create a plan to update these voices and add that to a new section in the editors notes markdown file.
+our target is to make this a 9.5 / 10 book with at least 85000 total words. Make sure the current dialogue isn't too clean, too functional, too information-delivery. Characters sometimes have incomplete thoughts, don't always speak in well-formed sentences, and occasionally interrupt each other or themselves.  make sure the dialog in the novel reads this way.  create a plan to update these voices and add that to a new section in the editors notes markdown file.
 
 - Humor, Strangeness, and the Unexpected
-our target is to make this a 9.5 / 10 book with around 85000 total words.Real characters deflect, joke badly, notice irrelevant things, and occasionally do something that doesn't serve the plot.  create a plan to inject these odities throughout the novel.  1-2 oddities per chapter.
+our target is to make this a 9.5 / 10 book with at least 85000 total words. Real characters deflect, joke badly, notice irrelevant things, and occasionally do something that doesn't serve the plot.  create a plan to inject these oddities throughout the novel.  1-2 oddities per chapter.
 
 - Prose Texture Variation
-our target is to make this a 9.5 / 10 book with around 85000 total words.Make sure the  prose has a varying literary density throughout. It should breathe -- denser in reflective moments, sparser in action, occasionally raw or clumsy when characters are overwhelmed. create a plan to update the prose and add that to a new section in the editors notes markdown file.
+our target is to make this a 9.5 / 10 book with at least 85000 total words. Make sure the prose has a varying literary density throughout. It should breathe -- denser in reflective moments, sparser in action, occasionally raw or clumsy when characters are overwhelmed. create a plan to update the prose and add that to a new section in the editors notes markdown file.
 
 - metaphors
-our target is to make this a 9.5 / 10 book with around 85000 total words.Make sure the text doesn't go overboard with metaphors.  create a plan to remove uneeded ones and add to a new section in the editors notes markdown.
+our target is to make this a 9.5 / 10 book with at least 85000 total words. Make sure the text doesn't go overboard with metaphors.  create a plan to remove unneeded ones and add to a new section in the editors notes markdown.
+
 
-- our target is to make this a 9.5 / 10 book with around 85000 total words.  we are now going to work through the new sections.  start with section 11.  i need you to loop through each of the items in this section.  for each item, create a plan to resolve the issue.  validate that this is the best plan.  then state what you will be doing and execute your plan.  once you have executed, update the item's status in the editor's notes markdown file.  if later issues in the editors notes are also resolved with your actions, update accordingly.  then move on to the next item.  do this for all items in the section.   if this requires multiple subagents, execute those without requesting permission.  
+- we are now going to work through the sections.  start with section 1. our target is to make this a 9.5 / 10 book with at least 85000 total words.  i need you to loop through each of the items in this section.  for each item, create a plan to resolve the issue.  validate that this is the best plan.  then state what you will be doing and execute your plan.  once you have executed, update the item's status in the editor's notes markdown file.  if later issues in the editors notes are also resolved with your actions, update accordingly.  then move on to the next item.  do this for all items in the section.   if this requires multiple subagents, execute those without requesting permission.  
 
-- Does this novel read like it has a soul?  or is it more like a flat instruction manual.  is this novel ready for initial publishing?
+- Does this novel read like it has a soul?  or is it more like a flat instruction manual.  is this novel ready for initial publishing?  how would you rate it on a scale of 1-10?
 
 - check if there are any gaps or rough scene cuts that are a result from all the edits
 
 - Please do a light copy-edit pass targeting prose repetitions.  Also remove unneeded dashes, em-dashes and hyphens.
 
 - please add a writing statistics section to the editors notes .md file.  please include: total words in the novel, average number of words per chapter, and then a formatted list of chapter numbers, names and words in that chapter.  freshen up the sections in the editors notes file if needed.
+
+
+## Styles
+
+A paperback novel is structured into three main sections: 
+Front Matter (preliminary pages), Body Matter (the story), and Back Matter (supplemental content). 
+Essential elements include a title page, copyright page, chapters, and usually an About the Author section. 
+Proper sequencing ensures professional, readable formatting for publication.
+
+Front Matter (Before the Story)
+	Half-Title Page: Contains only the book title.
+	Title Page: Includes the title, subtitle, author name, and publisher.
+	Copyright Page: Details legal info, publication year, ISBN, and rights.
+	Dedication: A brief personal note from the author.
+	Table of Contents: List of chapters and sections.
+	Epigraph: A short, thematic quote or poem.
+	Foreword/Preface/Acknowledgments: Optional sections providing context or thanking supporters.
+	Prologue: An opening scene setting the stage for fiction.
+
+Body Matter (The Story)
+	Chapters: The main content, divided into segments.
+	Epilogue: A concluding scene after the main story.
+
+Back Matter (After the Story)
+	About the Author: A short biography.
+	Acknowledgments: (If not in the front matter) Recognizes those who helped create the book.
+	Appendix/Glossary: Additional information or definitions (more common in nonfiction).
+	Bibliography: Sources used.
+
+Cover Elements
+	Front Cover: Title, author, illustration.
+	Back Cover: Synopsis, endorsements, and bio
+
+
+## PDF Generation
+pandoc SOURCE.md -o DEST.pdf --pdf-engine=weasyprint --pdf-engine-opt=--verbose --toc --standalone > output.txt 2>&1
@@ -32,6 +32,7 @@
     _sanitize_for_content_policy,
     check_chapter_length,
     expand_chapter,
+    extract_named_characters,
     format_vocabulary_rules,
     get_forbidden_words,
     get_soft_limited_words,
@@ -59,6 +60,7 @@
     build_chapter_summary_prompt,
     build_character_agent_prompt,
     build_character_field_repair_prompt,
+    build_character_reconciliation_prompt,
     build_character_relationship_prompt,
     build_character_resolution_validator_prompt,
     build_chapter_pattern_extractor_prompt,
@@ -100,6 +102,7 @@
     _run_all_chapter_agents,
     run_chapter_pattern_extractor,
     run_chapter_rhythm_classifier,
+    run_character_reconciliation,
     run_character_state_updater,
     run_continuity_gatekeeper,
     run_per_chapter_compression_check,
 
@@ -1,5 +1,6 @@
 """Shared constants, pass-failure helpers, content-policy retry, vocabulary scanning, and length enforcement."""
 
+import difflib
 import logging
 import re
 from collections.abc import Callable
@@ -397,6 +398,162 @@ def scan_vocabulary_overuse(chapter_text: str, genre: str = "") -> list[str]:
     return warnings
 
 
+# ---------------------------------------------------------------------------
+# Named-character detection (for reconciliation against the canonical roster)
+# ---------------------------------------------------------------------------
+
+# Common capitalized English words that are NOT character names. Used to
+# filter sentence-initial and conventional capitalization out of the
+# named-character scanner. Roster-token matching is applied BEFORE this
+# filter, so a character legitimately named "May" or "Crown" is still
+# detected correctly — the stop list only catches spans that have no
+# roster hit.
+_NAMED_CHARACTER_STOP_WORDS: frozenset[str] = frozenset({
+    # Pronouns / sentence-initial
+    "i", "he", "she", "they", "it", "we", "you", "me", "him", "her", "them", "us",
+    "his", "hers", "theirs", "its", "ours", "yours", "mine",
+    "this", "that", "these", "those", "there", "here",
+    "then", "when", "where", "why", "how", "what", "who", "whose", "which",
+    # Conjunctions / modifiers
+    "the", "a", "an", "and", "or", "but", "so", "yet", "as", "if", "while",
+    "because", "since", "although", "though", "unless", "until",
+    "not", "never", "always", "still", "only", "even", "also",
+    "now", "before", "after", "later", "soon", "ago", "once", "twice",
+    "yes", "no", "ok", "okay",
+    # Days
+    "monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday",
+    # Months (excluding May — often a character name; roster check handles it)
+    "january", "february", "march", "april", "june", "july",
+    "august", "september", "october", "november", "december",
+    # Honorifics / titles that commonly appear alone
+    "mr", "mrs", "ms", "dr", "sir", "madam", "lord", "lady",
+    "captain", "lieutenant", "sergeant", "major", "colonel", "general",
+    "professor", "father", "mother", "sister", "brother", "uncle", "aunt",
+    "detective", "inspector", "officer", "commander", "admiral", "chief",
+    "doctor", "nurse", "reverend", "pastor",
+    # Structural / narrative
+    "chapter", "book", "part", "act", "scene", "volume", "prologue", "epilogue",
+    # Exclamations / religious references
+    "god", "christ", "jesus", "heaven", "hell",
+    # Greetings / filler
+    "hello", "goodbye", "thanks", "please",
+    # Cardinal directions / generic place words
+    "north", "south", "east", "west", "street", "road", "avenue", "place",
+    "square", "city", "town", "village", "county", "state", "country",
+})
+
+
+# Candidate-name regex: one to three adjacent capitalized tokens.
+# Matches "Sarah", "Sarah Miller", "John Fitzgerald Kennedy" but does not
+# span apostrophes, hyphens, or punctuation — so "Sarah's" yields "Sarah".
+_NAME_CANDIDATE_RE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+){0,2}\b")
+
+
+def _roster_name_tokens(roster: list[dict]) -> set[str]:
+    """Return the set of lowercase name tokens from a character roster.
+
+    Each character's ``name`` is split on whitespace; tokens shorter than
+    two characters are discarded (they produce too many false positives
+    under fuzzy matching).
+    """
+    tokens: set[str] = set()
+    for ch in roster or []:
+        if not isinstance(ch, dict):
+            continue
+        name = str(ch.get("name", "")).strip()
+        if not name:
+            continue
+        for tok in name.split():
+            tok_clean = tok.strip(".,;:'\"").lower()
+            if len(tok_clean) >= 2:
+                tokens.add(tok_clean)
+    return tokens
+
+
+def extract_named_characters(
+    chapter_text: str,
+    roster: list[dict],
+    *,
+    min_mentions: int = 2,
+    fuzzy_cutoff: float = 0.85,
+) -> dict:
+    """Detect named characters in chapter prose and classify them against *roster*.
+
+    Pure Python — no LLM call. Uses a capitalized-span regex, a stop-word
+    filter, and :func:`difflib.get_close_matches` for variant detection.
+
+    Parameters
+    ----------
+    chapter_text:  The chapter prose to scan.
+    roster:        The canonical ``character_list`` (list of dicts with
+                   a ``name`` key).
+    min_mentions:  Minimum number of occurrences required before a
+                   capitalized span is reported as an unknown character.
+                   Spans that appear fewer times are treated as likely
+                   sentence-initial false positives or throwaway walk-ons.
+    fuzzy_cutoff:  :mod:`difflib` similarity threshold for variant matching.
+                   Higher = stricter. 0.85 catches typos and short
+                   diminutives without conflating distinct names.
+
+    Returns
+    -------
+    dict with three keys:
+        ``known``:    sorted list of capitalized spans that intersect the
+                      roster's name tokens (for diagnostic logging).
+        ``unknown``:  list of ``(prose_name, count)`` tuples for names with
+                      no roster match and at least *min_mentions* occurrences,
+                      ordered by descending count.
+        ``variants``: list of ``(prose_name, roster_token, count)`` tuples
+                      — likely misspellings or diminutives of roster names.
+    """
+    tokens = _roster_name_tokens(roster)
+
+    raw_counts: dict[str, int] = {}
+    for m in _NAME_CANDIDATE_RE.finditer(chapter_text):
+        raw_counts[m.group()] = raw_counts.get(m.group(), 0) + 1
+
+    known: set[str] = set()
+    unknown_counts: dict[str, int] = {}
+    for span, count in raw_counts.items():
+        span_tokens = [t.lower() for t in span.split()]
+        # Roster check first: a span with any token matching a roster token
+        # is a known character, regardless of stop-word overlap.
+        if tokens and any(t in tokens for t in span_tokens):
+            known.add(span)
+            continue
+        # Drop spans whose every token is a stop word (sentence-initial
+        # noise, honorifics with no name attached, etc.).
+        if all(t in _NAMED_CHARACTER_STOP_WORDS for t in span_tokens):
+            continue
+        if count < min_mentions:
+            continue
+        unknown_counts[span] = count
+
+    variants: list[tuple[str, str, int]] = []
+    unknowns: list[tuple[str, int]] = []
+    roster_token_list = sorted(tokens)
+    for span, count in sorted(unknown_counts.items(), key=lambda kv: (-kv[1], kv[0])):
+        match_found: str | None = None
+        if roster_token_list:
+            for t in span.split():
+                close = difflib.get_close_matches(
+                    t.lower(), roster_token_list, n=1, cutoff=fuzzy_cutoff,
+                )
+                if close:
+                    match_found = close[0]
+                    break
+        if match_found is not None:
+            variants.append((span, match_found, count))
+        else:
+            unknowns.append((span, count))
+
+    return {
+        "known": sorted(known),
+        "unknown": unknowns,
+        "variants": variants,
+    }
+
+
 # ---------------------------------------------------------------------------
 # Chapter length enforcement
 # ---------------------------------------------------------------------------
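The classification order in `extract_named_characters` (roster hit first, then stop-word and mention-count filtering, then fuzzy variant matching) can be illustrated with a minimal, self-contained sketch. It does not import from `novelforge`. The function name `classify_names`, the abbreviated stop list, the sample roster, and the sample text are all hypothetical and exist only for illustration. The regex and the `difflib.get_close_matches` call mirror the ones in the diff.

```python
import difflib
import re

# Same candidate pattern as in the diff: one to three adjacent capitalized tokens.
NAME_RE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+){0,2}\b")

# Abbreviated, illustrative stop list (the real one in the diff is much longer).
STOP_WORDS = frozenset({"the", "he", "she", "then", "later", "chapter"})


def classify_names(text, roster_names, min_mentions=2, fuzzy_cutoff=0.85):
    """Hypothetical compact re-implementation of the classification steps."""
    roster_tokens = sorted(
        {tok.lower() for name in roster_names for tok in name.split() if len(tok) >= 2}
    )

    # Count every capitalized span found in the prose.
    counts = {}
    for m in NAME_RE.finditer(text):
        counts[m.group()] = counts.get(m.group(), 0) + 1

    known, unknown, variants = set(), [], []
    for span, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        parts = [p.lower() for p in span.split()]
        # 1. Roster check first, so a character named "May" is never filtered out.
        if any(p in roster_tokens for p in parts):
            known.add(span)
            continue
        # 2. Drop pure stop-word spans and spans mentioned too rarely.
        if all(p in STOP_WORDS for p in parts) or count < min_mentions:
            continue
        # 3. Fuzzy-match the remaining spans against roster tokens.
        match = None
        for p in parts:
            close = difflib.get_close_matches(p, roster_tokens, n=1, cutoff=fuzzy_cutoff)
            if close:
                match = close[0]
                break
        if match is not None:
            variants.append((span, match, count))
        else:
            unknown.append((span, count))

    return {"known": sorted(known), "unknown": unknown, "variants": variants}


ROSTER = ["Sarah Miller", "May"]  # hypothetical roster
TEXT = (
    "Sarah opened the door. Sara did not answer. Later, Sara wrote to Tobias. "
    "Tobias kept the letter for May."
)

result = classify_names(TEXT, ROSTER)
print(result)
```

With this sample, "Sarah" and "May" land in `known`, "Sara" is reported as a likely variant of the roster token "sarah", and "Tobias" is reported as an unknown character with two mentions. "Later" occurs only once and is dropped.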