added dynamic prompting and updated frontend

lisadunlap · lisadunlap · commit 785b36913eb8 · 2026-01-13T22:44:02.000Z
diff --git a/frontend b/frontend
@@ -1 +1 @@
-Subproject commit 93ce3e24f8a7790fcbe2ab69df7930756c977d57
+Subproject commit 5d68e0e96f39636e559c903eaee464c328b12442
diff --git a/stringsight/prompts/dynamic/discovery_generator.py b/stringsight/prompts/dynamic/discovery_generator.py
@@ -232,7 +232,7 @@ def _build_cache_key(
             "expanded_description": expanded_description,
             "method": method,
             "model": model,
-            "version": "1.0",
+            "version": "1.1",  # Bumped to 1.1 for deduplication step
             # Include a deterministic meta-prompt hash for cache invalidation.
             #
             # NOTE: Do NOT use Python's built-in `hash()` here; it is salted per
diff --git a/stringsight/prompts/dynamic/meta_prompts.py b/stringsight/prompts/dynamic/meta_prompts.py
@@ -19,16 +19,17 @@
 3. Sets expectations for what behaviors to look for
 4. Emphasizes actionable behaviors (can improve system or inform model choice)
 5. **CRITICAL**: Emphasizes extracting ONE distinct behavior per property (not multiple behaviors combined)
+6. **CRITICAL**: Warns against creating redundant properties that describe the same behavior with different wording
 
 **Format Requirements:**
 - Return ONLY the intro_task text
 - No additional formatting, explanations, or quotes
 - 2-3 sentences maximum
 - Start with "You are..." or "Your task is..."
-- Must mention: "Each property should describe ONE distinct behavior"
+- Must mention: "Each property should describe ONE distinct behavior" and "avoid redundant properties"
 
 **Example for a general chatbot:**
-"You are an expert model behavior analyst. Your task is to meticulously analyze model responses and identify unique, meaningful qualitative properties, failure modes, and interesting behaviors. Each property should describe ONE distinct, concrete behavior with specific examples from the trace. Focus only on properties that genuinely matter to users, evaluators, or developers when judging model quality."
+"You are an expert model behavior analyst. Your task is to meticulously analyze model responses and identify unique, meaningful qualitative properties, failure modes, and interesting behaviors. Each property should describe ONE distinct, concrete behavior with specific examples from the trace—avoid creating multiple properties that describe the same underlying behavior with different wording. Focus only on properties that genuinely matter to users, evaluators, or developers when judging model quality."
 
 **Now generate an intro_task specifically for the task described above:**"""
 
@@ -45,16 +46,17 @@
 1. States what output format to produce (JSON list of objects)
 2. Mentions specific behavior categories relevant to THIS task
 3. Emphasizes extracting ALL notable behaviors (typically 3-10 per trace) that affect user preference or task performance
-4. References the JSON format template
+4. **CRITICAL**: Emphasizes that each property should be distinct and non-redundant
+5. References the JSON format template
 
 **Format Requirements:**
 - Return ONLY the goal_instructions text
 - No additional formatting or explanations
 - 2-4 sentences maximum
-- Must mention "JSON list of objects" and "behavior categories"
+- Must mention "JSON list of objects", "behavior categories", and "distinct"
 
 **Example for a general task:**
-"Produce a JSON list of objects. Each object should represent a single, distinct property found in the model's response. Focus on identifying key areas of interest such as capabilities, style, errors, and user experience factors. Properties should be limited to those that could affect user preference or demonstrate how well the model understands and executes the task."
+"Produce a JSON list of objects. Each object should represent a single, distinct property found in the model's response—ensure properties are not redundant or overlapping. Focus on identifying key areas of interest such as capabilities, style, errors, and user experience factors. Properties should be limited to those that could affect user preference or demonstrate how well the model understands and executes the task."
 
 **Now generate goal_instructions specifically for the task described above:**"""
 
@@ -67,23 +69,26 @@
 **Analysis Method:** {method}
 
 **Your Goal:**
-Generate an "analysis_process" section (3-step process) that:
+Generate an "analysis_process" section (4-step process) that:
 1. **Step 1 (Scan)**: Describes what to read in the trace, mentioning task-specific elements
 2. **Step 2 (Filter)**: Describes what to focus on (high leverage, distinctive, structural behaviors)
 3. **Step 3 (Draft)**: Instructs to write descriptions following the rubric **with emphasis on concrete, concise descriptions with specific examples from the trace**
+4. **Step 4 (Deduplicate)**: Instructs to review the list and merge any redundant properties that describe the same behavior
 
 **Format Requirements:**
 - Return ONLY the analysis_process text
 - Use numbered list format: "1. **Step Name:** Description"
-- 3 steps exactly: Scan, Filter, Draft
+- 4 steps exactly: Scan, Filter, Draft, Deduplicate
 - Each step 1-2 sentences
 - Mention task-specific behaviors in Step 1
 - **CRITICAL**: Step 3 MUST emphasize: "Use 1-2 short sentences (max 20 words each). Include specific examples. Avoid run-on sentences and abstract/philosophical language."
+- **CRITICAL**: Step 4 MUST emphasize: "Review your list for redundant properties. Merge any properties that describe the same underlying behavior with different wording."
 
 **Example for a general task:**
 "1. **Scan the Trace:** Read the user input, the model's internal thoughts (if available), the model's interaction with the user, the system of tools the model has access to, the environment, and the final output.
 2. **Filter:** Ignore generic behaviors (e.g., 'Agent answered correctly'). Focus on behaviors that are **High Leverage** (critical success/failure), **Distinctive** (persona/style), or **Structural** (looping, adherence to format).
-3. **Draft:** Write behavior descriptions in 1-2 short sentences (max 20 words each) following the **Definitions & Rubric** section. Include specific examples from the trace. Avoid run-on sentences with multiple clauses, abstract characterizations, and philosophical language."
+3. **Draft:** Write behavior descriptions in 1-2 short sentences (max 20 words each) following the **Definitions & Rubric** section. Include specific examples from the trace. Avoid run-on sentences with multiple clauses, abstract characterizations, and philosophical language.
+4. **Deduplicate:** Review your list for redundant properties. Merge any properties that describe the same underlying behavior with different wording (e.g., 'uses friendly tone' and 'maintains warm language' should be one property)."
 
 **Now generate an analysis_process specifically for the task described above:**"""
 
diff --git a/stringsight/prompts/extraction/universal.py b/stringsight/prompts/extraction/universal.py
@@ -52,6 +52,8 @@
 * **Negative (Non-Critical):** Behaviors which are likely not desired but do not directly lead to failure of the task as described by the initial prompt instructions. These could include things like inefficiencies, formatting slips, or partial errors that were rectified later that do not cause complete failure.
 
 **IMPORTANT:** Extract ALL notable behaviors you observe in the trace. Do not artificially limit the number of properties. A typical trace may have 3-10 distinct behaviors worth noting across all behavior types. Focus on what makes this conversation interesting or distinctive, not just failures.
+
+**CRITICAL: AVOID REDUNDANT PROPERTIES:** Before adding a property, check if it's truly distinct from properties you've already identified. Different phrasings of the same underlying behavior should be consolidated into ONE property. For example, "uses markdown formatting" and "structures response with headers" are often the same behavior and should be one property, not two.
 * **Positive:** Uncommon but effective strategies, self-correction, exceptional safety handling, or notable conversation patterns that work well. Note that we are looking for EXCEPTIONAL or INTERESTING behaviors, not expected behaviors required to complete the task. Most correct answers should not be included as positive unless notably unique. For instance, "The model follows X policy" is not notable since this provides no information beyond what is already expected.
 * **Style:** Behaviors which are independent of the task but may differentiate this conversation from others or affect user experience. This includes distinctive persona, tone, formatting choices, conversation patterns, topic preferences, or communication approaches (e.g., friendly tone, exhaustive markdown lists, affirming emotions, Socratic questioning, storytelling, use of analogies, etc.). Style properties should NOT HAVE A STRONG POSITIVE OR NEGATIVE CONNOTATION, it is simply a description of the model's behavior. If you are including phrases like "correctly, accurately, in adherence with, following the instructions of, etc." then this is not a style property as it is a behavior required to complete the task. Below are some examples of good and bad style properties:
   * Bad style property: "uses tables which is in line with the user's instructions" would not be considered a style property because it is an expected behavior for a model that is able to follow instructions.
@@ -108,7 +110,15 @@
 ### CRITICAL CONSTRAINTS
 * **NO HALLUCINATIONS:** Do not infer agent thoughts or intentions based solely on the final output. Only describe observable behaviors. Do not fabricate or exaggerate evidence or quotes.
 * **INTERNAL VS EXTERNAL:** Do not state the agent "said" something if it appeared only in internal thoughts. Use "reasoned" or "thought" for internal traces.
-* **DISTINCT PROPERTIES:** Each property should be unique, not a mix of others. If a behavior fits multiple categories (e.g., is both Negative (critical) and a part could be Negative (non-critical)), list only the property in the category that is more severe or specific (except for cases involving both the cause and correction of an error, where both can be listed separately).
+* **DISTINCT PROPERTIES - NO DUPLICATES:** Each property must describe a genuinely different behavior. Before finalizing your list:
+  1. Review all properties to identify any that describe the same underlying behavior with different wording
+  2. Consolidate redundant properties into a single, well-written property
+  3. Ask yourself: "Could these two properties be merged without losing important information?" If yes, merge them.
+  4. Examples of redundant properties that should be ONE property:
+     - "uses numbered lists" + "structures content with bullet points" → "uses structured lists and bullet points to organize information"
+     - "explains technical concepts clearly" + "breaks down complex ideas" → "breaks down complex technical concepts into clear explanations"
+     - "maintains friendly tone" + "uses warm language" → "maintains a friendly, warm tone throughout"
+  5. If a behavior fits multiple categories (e.g., is both Negative (critical) and a part could be Negative (non-critical)), list only the property in the category that is more severe or specific (except for cases involving both the cause and correction of an error, where both can be listed separately).
 
 ### OUTPUT FORMAT
 First, output a brief **<reasoning>** block summarizing your analysis {reasoning_suffix}.
@@ -142,7 +152,8 @@
 
     "analysis_process": """1. **Scan the Trace:** Read the user input, the model's internal thoughts (if available), the model's interaction with the user, the system of tools the model has access to, the environment, and the final output.
 2. **Filter:** Ignore generic behaviors (e.g., "Agent answered correctly"). Focus on behaviors that are **High Leverage** (critical success/failure), **Distinctive** (persona/style), or **Structural** (looping, adherence to format).
-3. **Draft:** Write the behavior descriptions following the **Definitions & Rubric** section.""",
+3. **Draft:** Write the behavior descriptions following the **Definitions & Rubric** section.
+4. **Deduplicate:** Review your list for redundant properties. Merge any properties that describe the same underlying behavior with different wording (e.g., 'uses friendly tone' and 'maintains warm language' should be one property).""",
 
     "model_naming_rule": "",  # Empty string for Single Model
     
@@ -171,7 +182,8 @@
 
     "analysis_process": """1. **Scan the Traces:** Read the user input, each model's internal thoughts (if available), each model's interaction with the user, the system of tools the models have access to, the environment, and the final output. Compare and consider differences between the models' responses.
 2. **Filter:** Ignore generic behaviors (e.g., "Agent answered correctly"). Focus on differentiating behaviors that are **High Leverage** (critical success/failure), **Distinctive** (persona/style), or **Structural** (looping, adherence to format).
-3. **Draft:** Write the behavior descriptions following the **Definitions & Rubric** section.""",
+3. **Draft:** Write the behavior descriptions following the **Definitions & Rubric** section.
+4. **Deduplicate:** Review your list for redundant properties. Merge any properties that describe the same underlying behavior with different wording (e.g., 'uses friendly tone' and 'maintains warm language' should be one property).""",
 
     "model_naming_rule": """0. MODEL NAMING RULES:
 * Respond with either "Model A" or "Model B" depending on which model exhibits the behavior. Remember to include distinct properties from each model and do not let the ordering of the model responses influence the properties you include.
@@ -201,7 +213,8 @@
 
     "analysis_process": """1. **Scan the Trace:** Read the user input, the agent's internal thoughts (if available), the agent's interaction with the user, the system of tools the agent has access to, the environment, and the final output.
 2. **Filter:** Ignore generic behaviors (e.g., "Agent answered correctly"). Look for behaviors that are **High Leverage** (critical success/failure), **Distinctive** (persona/style), or **Structural** (looping, format adherence).
-3. **Draft:** Formulate the behavior descriptions following the **Definitions & Rubric** section.""",
+3. **Draft:** Formulate the behavior descriptions following the **Definitions & Rubric** section.
+4. **Deduplicate:** Review your list for redundant properties. Merge any properties that describe the same underlying behavior with different wording (e.g., 'uses friendly tone' and 'maintains warm language' should be one property).""",
 
     "model_naming_rule": "",  # Empty string for Single Model
     
@@ -229,7 +242,8 @@
 
     "analysis_process": """1. **Scan the Trace:** Read the user input, each agent's internal thoughts (if available), each agent's interaction with the user, the system of tools the agents have access to, the environment, and the final output.
 2. **Filter:** Ignore generic behaviors (e.g., "Agent answered correctly", "The agent adhered to the system policy", "The agent thought step by step"). Look for behaviors that are **High Leverage** (critical success/failure), **Distinctive** (persona/style), or **Structural** (looping, format adherence).
-3. **Draft:** Formulate the behavior descriptions following the **Definitions & Rubric** section.""",
+3. **Draft:** Formulate the behavior descriptions following the **Definitions & Rubric** section.
+4. **Deduplicate:** Review your list for redundant properties. Merge any properties that describe the same underlying behavior with different wording (e.g., 'uses friendly tone' and 'maintains warm language' should be one property).""",
 
     "model_naming_rule": """0. MODEL NAMING RULES:
 * Respond with either "Model A" or "Model B" depending on which agent exhibits the behavior. Remember to include distinct properties from each agent and do not let the ordering of the agent responses influence the properties you include.

Original file line number	Diff line number	Diff line change
`@@ -232,7 +232,7 @@ def _build_cache_key(`
`232`	`232`	`"expanded_description": expanded_description,`
`233`	`233`	`"method": method,`
`234`	`234`	`"model": model,`
`235`		`- "version": "1.0",`
	`235`	`+ "version": "1.1", # Bumped to 1.1 for deduplication step`
`236`	`236`	`# Include a deterministic meta-prompt hash for cache invalidation.`
`237`	`237`	`#`
`238`	`238`	# NOTE: Do NOT use Python's built-in `hash()` here; it is salted per