Commit 785b369

added dynamic prompting and updated frontend
1 parent 2b3a6b2 commit 785b369

4 files changed

Lines changed: 34 additions & 15 deletions

stringsight/prompts/dynamic/discovery_generator.py

Lines changed: 1 addition & 1 deletion
@@ -232,7 +232,7 @@ def _build_cache_key(
 "expanded_description": expanded_description,
 "method": method,
 "model": model,
-"version": "1.0",
+"version": "1.1", # Bumped to 1.1 for deduplication step
 # Include a deterministic meta-prompt hash for cache invalidation.
 #
 # NOTE: Do NOT use Python's built-in `hash()` here; it is salted per
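
The version bump and the NOTE about `hash()` serve the same goal: a cache key has to be identical across processes and has to change whenever the prompts change. Python's built-in `hash()` on strings is randomized per process by default, so it cannot be used for this. Below is a minimal sketch of the idea, assuming a SHA-256 digest over a sorted JSON payload. `stable_hash`, `build_cache_key`, the `meta_prompt` argument, and the `meta_prompt_hash` field are hypothetical names for illustration, not the repository's actual implementation.

```python
import hashlib
import json

CACHE_VERSION = "1.1"  # mirrors the bumped value shown in the diff


def stable_hash(text: str) -> str:
    """Return a SHA-256 hex digest that is identical across processes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_cache_key(expanded_description: str, method: str, model: str, meta_prompt: str) -> str:
    """Hypothetical cache key: a digest of every input that affects the output."""
    payload = {
        "expanded_description": expanded_description,
        "method": method,
        "model": model,
        "version": CACHE_VERSION,
        # Assumed field name; hashing the meta-prompt text invalidates the
        # cache whenever the prompt wording changes.
        "meta_prompt_hash": stable_hash(meta_prompt),
    }
    # sort_keys makes the serialized form independent of dict insertion order.
    return stable_hash(json.dumps(payload, sort_keys=True))
```

With this layout, bumping `CACHE_VERSION` or editing the meta-prompt text both yield a new key, so stale entries are simply never looked up again.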

stringsight/prompts/dynamic/meta_prompts.py

Lines changed: 13 additions & 8 deletions
@@ -19,16 +19,17 @@
 3. Sets expectations for what behaviors to look for
 4. Emphasizes actionable behaviors (can improve system or inform model choice)
 5. **CRITICAL**: Emphasizes extracting ONE distinct behavior per property (not multiple behaviors combined)
+6. **CRITICAL**: Warns against creating redundant properties that describe the same behavior with different wording

 **Format Requirements:**
 - Return ONLY the intro_task text
 - No additional formatting, explanations, or quotes
 - 2-3 sentences maximum
 - Start with "You are..." or "Your task is..."
-- Must mention: "Each property should describe ONE distinct behavior"
+- Must mention: "Each property should describe ONE distinct behavior" and "avoid redundant properties"

 **Example for a general chatbot:**
-"You are an expert model behavior analyst. Your task is to meticulously analyze model responses and identify unique, meaningful qualitative properties, failure modes, and interesting behaviors. Each property should describe ONE distinct, concrete behavior with specific examples from the trace. Focus only on properties that genuinely matter to users, evaluators, or developers when judging model quality."
+"You are an expert model behavior analyst. Your task is to meticulously analyze model responses and identify unique, meaningful qualitative properties, failure modes, and interesting behaviors. Each property should describe ONE distinct, concrete behavior with specific examples from the trace—avoid creating multiple properties that describe the same underlying behavior with different wording. Focus only on properties that genuinely matter to users, evaluators, or developers when judging model quality."

 **Now generate an intro_task specifically for the task described above:**"""

@@ -45,16 +46,17 @@
 1. States what output format to produce (JSON list of objects)
 2. Mentions specific behavior categories relevant to THIS task
 3. Emphasizes extracting ALL notable behaviors (typically 3-10 per trace) that affect user preference or task performance
-4. References the JSON format template
+4. **CRITICAL**: Emphasizes that each property should be distinct and non-redundant
+5. References the JSON format template

 **Format Requirements:**
 - Return ONLY the goal_instructions text
 - No additional formatting or explanations
 - 2-4 sentences maximum
-- Must mention "JSON list of objects" and "behavior categories"
+- Must mention "JSON list of objects", "behavior categories", and "distinct"

 **Example for a general task:**
-"Produce a JSON list of objects. Each object should represent a single, distinct property found in the model's response. Focus on identifying key areas of interest such as capabilities, style, errors, and user experience factors. Properties should be limited to those that could affect user preference or demonstrate how well the model understands and executes the task."
+"Produce a JSON list of objects. Each object should represent a single, distinct property found in the model's response—ensure properties are not redundant or overlapping. Focus on identifying key areas of interest such as capabilities, style, errors, and user experience factors. Properties should be limited to those that could affect user preference or demonstrate how well the model understands and executes the task."

 **Now generate goal_instructions specifically for the task described above:**"""

@@ -67,23 +69,26 @@
 **Analysis Method:** {method}

 **Your Goal:**
-Generate an "analysis_process" section (3-step process) that:
+Generate an "analysis_process" section (4-step process) that:
 1. **Step 1 (Scan)**: Describes what to read in the trace, mentioning task-specific elements
 2. **Step 2 (Filter)**: Describes what to focus on (high leverage, distinctive, structural behaviors)
 3. **Step 3 (Draft)**: Instructs to write descriptions following the rubric **with emphasis on concrete, concise descriptions with specific examples from the trace**
+4. **Step 4 (Deduplicate)**: Instructs to review the list and merge any redundant properties that describe the same behavior

 **Format Requirements:**
 - Return ONLY the analysis_process text
 - Use numbered list format: "1. **Step Name:** Description"
-- 3 steps exactly: Scan, Filter, Draft
+- 4 steps exactly: Scan, Filter, Draft, Deduplicate
 - Each step 1-2 sentences
 - Mention task-specific behaviors in Step 1
 - **CRITICAL**: Step 3 MUST emphasize: "Use 1-2 short sentences (max 20 words each). Include specific examples. Avoid run-on sentences and abstract/philosophical language."
+- **CRITICAL**: Step 4 MUST emphasize: "Review your list for redundant properties. Merge any properties that describe the same underlying behavior with different wording."

 **Example for a general task:**
 "1. **Scan the Trace:** Read the user input, the model's internal thoughts (if available), the model's interaction with the user, the system of tools the model has access to, the environment, and the final output.
 2. **Filter:** Ignore generic behaviors (e.g., 'Agent answered correctly'). Focus on behaviors that are **High Leverage** (critical success/failure), **Distinctive** (persona/style), or **Structural** (looping, adherence to format).
-3. **Draft:** Write behavior descriptions in 1-2 short sentences (max 20 words each) following the **Definitions & Rubric** section. Include specific examples from the trace. Avoid run-on sentences with multiple clauses, abstract characterizations, and philosophical language."
+3. **Draft:** Write behavior descriptions in 1-2 short sentences (max 20 words each) following the **Definitions & Rubric** section. Include specific examples from the trace. Avoid run-on sentences with multiple clauses, abstract characterizations, and philosophical language.
+4. **Deduplicate:** Review your list for redundant properties. Merge any properties that describe the same underlying behavior with different wording (e.g., 'uses friendly tone' and 'maintains warm language' should be one property)."

 **Now generate an analysis_process specifically for the task described above:**"""
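
The format requirements above ("4 steps exactly: Scan, Filter, Draft, Deduplicate", numbered as "1. **Step Name:** Description") are easy to check mechanically once the model returns its text. The sketch below is one possible validator, not part of the commit: `check_analysis_process`, `EXPECTED_STEPS`, and `STEP_PATTERN` are hypothetical names, and the check assumes each step starts on its own line.

```python
import re

# Step names taken from the format requirements in the meta-prompt.
EXPECTED_STEPS = ("Scan", "Filter", "Draft", "Deduplicate")

# Matches lines such as "1. **Scan the Trace:** Read the user input ..."
STEP_PATTERN = re.compile(r"^\s*(\d+)\.\s+\*\*([^*]+?):\*\*", re.MULTILINE)


def check_analysis_process(text: str) -> list[str]:
    """Return a list of problems; an empty list means the text looks valid."""
    problems = []
    steps = STEP_PATTERN.findall(text)
    if len(steps) != len(EXPECTED_STEPS):
        problems.append(f"expected {len(EXPECTED_STEPS)} steps, found {len(steps)}")
    for position, (found, expected) in enumerate(zip(steps, EXPECTED_STEPS), start=1):
        number, name = found
        if int(number) != position:
            problems.append(f"step {position} is numbered {number}")
        if not name.strip().startswith(expected):
            problems.append(
                f"step {position} should start with {expected!r}, got {name.strip()!r}"
            )
    return problems
```

A check like this would flag output generated under the old 3-step prompt, which is the kind of stale result the cache version bump is meant to avoid serving.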

stringsight/prompts/extraction/universal.py

Lines changed: 19 additions & 5 deletions
@@ -52,6 +52,8 @@
 * **Negative (Non-Critical):** Behaviors which are likely not desired but do not directly lead to failure of the task as described by the initial prompt instructions. These could include things like inefficiencies, formatting slips, or partial errors that were rectified later that do not cause complete failure.

 **IMPORTANT:** Extract ALL notable behaviors you observe in the trace. Do not artificially limit the number of properties. A typical trace may have 3-10 distinct behaviors worth noting across all behavior types. Focus on what makes this conversation interesting or distinctive, not just failures.
+
+**CRITICAL: AVOID REDUNDANT PROPERTIES:** Before adding a property, check if it's truly distinct from properties you've already identified. Different phrasings of the same underlying behavior should be consolidated into ONE property. For example, "uses markdown formatting" and "structures response with headers" are often the same behavior and should be one property, not two.
 * **Positive:** Uncommon but effective strategies, self-correction, exceptional safety handling, or notable conversation patterns that work well. Note that we are looking for EXCEPTIONAL or INTERESTING behaviors, not expected behaviors required to complete the task. Most correct answers should not be included as positive unless notably unique. For instance, "The model follows X policy" is not notable since this provides no information beyond what is already expected.
 * **Style:** Behaviors which are independent of the task but may differentiate this conversation from others or affect user experience. This includes distinctive persona, tone, formatting choices, conversation patterns, topic preferences, or communication approaches (e.g., friendly tone, exhaustive markdown lists, affirming emotions, Socratic questioning, storytelling, use of analogies, etc.). Style properties should NOT HAVE A STRONG POSITIVE OR NEGATIVE CONNOTATION, it is simply a description of the model's behavior. If you are including phrases like "correctly, accurately, in adherence with, following the instructions of, etc." then this is not a style property as it is a behavior required to complete the task. Below are some examples of good and bad style properties:
 * Bad style property: "uses tables which is in line with the user's instructions" would not be considered a style property because it is an expected behavior for a model that is able to follow instructions.
@@ -108,7 +110,15 @@
 ### CRITICAL CONSTRAINTS
 * **NO HALLUCINATIONS:** Do not infer agent thoughts or intentions based solely on the final output. Only describe observable behaviors. Do not fabricate or exaggerate evidence or quotes.
 * **INTERNAL VS EXTERNAL:** Do not state the agent "said" something if it appeared only in internal thoughts. Use "reasoned" or "thought" for internal traces.
-* **DISTINCT PROPERTIES:** Each property should be unique, not a mix of others. If a behavior fits multiple categories (e.g., is both Negative (critical) and a part could be Negative (non-critical)), list only the property in the category that is more severe or specific (except for cases involving both the cause and correction of an error, where both can be listed separately).
+* **DISTINCT PROPERTIES - NO DUPLICATES:** Each property must describe a genuinely different behavior. Before finalizing your list:
+1. Review all properties to identify any that describe the same underlying behavior with different wording
+2. Consolidate redundant properties into a single, well-written property
+3. Ask yourself: "Could these two properties be merged without losing important information?" If yes, merge them.
+4. Examples of redundant properties that should be ONE property:
+- "uses numbered lists" + "structures content with bullet points" → "uses structured lists and bullet points to organize information"
+- "explains technical concepts clearly" + "breaks down complex ideas" → "breaks down complex technical concepts into clear explanations"
+- "maintains friendly tone" + "uses warm language" → "maintains a friendly, warm tone throughout"
+5. If a behavior fits multiple categories (e.g., is both Negative (critical) and a part could be Negative (non-critical)), list only the property in the category that is more severe or specific (except for cases involving both the cause and correction of an error, where both can be listed separately).

 ### OUTPUT FORMAT
 First, output a brief **<reasoning>** block summarizing your analysis {reasoning_suffix}.
@@ -142,7 +152,8 @@

 "analysis_process": """1. **Scan the Trace:** Read the user input, the model's internal thoughts (if available), the model's interaction with the user, the system of tools the model has access to, the environment, and the final output.
 2. **Filter:** Ignore generic behaviors (e.g., "Agent answered correctly"). Focus on behaviors that are **High Leverage** (critical success/failure), **Distinctive** (persona/style), or **Structural** (looping, adherence to format).
-3. **Draft:** Write the behavior descriptions following the **Definitions & Rubric** section.""",
+3. **Draft:** Write the behavior descriptions following the **Definitions & Rubric** section.
+4. **Deduplicate:** Review your list for redundant properties. Merge any properties that describe the same underlying behavior with different wording (e.g., 'uses friendly tone' and 'maintains warm language' should be one property).""",

 "model_naming_rule": "", # Empty string for Single Model

@@ -171,7 +182,8 @@

 "analysis_process": """1. **Scan the Traces:** Read the user input, each model's internal thoughts (if available), each model's interaction with the user, the system of tools the models have access to, the environment, and the final output. Compare and consider differences between the models' responses.
 2. **Filter:** Ignore generic behaviors (e.g., "Agent answered correctly"). Focus on differentiating behaviors that are **High Leverage** (critical success/failure), **Distinctive** (persona/style), or **Structural** (looping, adherence to format).
-3. **Draft:** Write the behavior descriptions following the **Definitions & Rubric** section.""",
+3. **Draft:** Write the behavior descriptions following the **Definitions & Rubric** section.
+4. **Deduplicate:** Review your list for redundant properties. Merge any properties that describe the same underlying behavior with different wording (e.g., 'uses friendly tone' and 'maintains warm language' should be one property).""",

 "model_naming_rule": """0. MODEL NAMING RULES:
 * Respond with either "Model A" or "Model B" depending on which model exhibits the behavior. Remember to include distinct properties from each model and do not let the ordering of the model responses influence the properties you include.
@@ -201,7 +213,8 @@

 "analysis_process": """1. **Scan the Trace:** Read the user input, the agent's internal thoughts (if available), the agent's interaction with the user, the system of tools the agent has access to, the environment, and the final output.
 2. **Filter:** Ignore generic behaviors (e.g., "Agent answered correctly"). Look for behaviors that are **High Leverage** (critical success/failure), **Distinctive** (persona/style), or **Structural** (looping, format adherence).
-3. **Draft:** Formulate the behavior descriptions following the **Definitions & Rubric** section.""",
+3. **Draft:** Formulate the behavior descriptions following the **Definitions & Rubric** section.
+4. **Deduplicate:** Review your list for redundant properties. Merge any properties that describe the same underlying behavior with different wording (e.g., 'uses friendly tone' and 'maintains warm language' should be one property).""",

 "model_naming_rule": "", # Empty string for Single Model

@@ -229,7 +242,8 @@

 "analysis_process": """1. **Scan the Trace:** Read the user input, each agent's internal thoughts (if available), each agent's interaction with the user, the system of tools the agents have access to, the environment, and the final output.
 2. **Filter:** Ignore generic behaviors (e.g., "Agent answered correctly", "The agent adhered to the system policy", "The agent thought step by step"). Look for behaviors that are **High Leverage** (critical success/failure), **Distinctive** (persona/style), or **Structural** (looping, format adherence).
-3. **Draft:** Formulate the behavior descriptions following the **Definitions & Rubric** section.""",
+3. **Draft:** Formulate the behavior descriptions following the **Definitions & Rubric** section.
+4. **Deduplicate:** Review your list for redundant properties. Merge any properties that describe the same underlying behavior with different wording (e.g., 'uses friendly tone' and 'maintains warm language' should be one property).""",

 "model_naming_rule": """0. MODEL NAMING RULES:
 * Respond with either "Model A" or "Model B" depending on which agent exhibits the behavior. Remember to include distinct properties from each agent and do not let the ordering of the agent responses influence the properties you include.
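
The same "4. **Deduplicate:**" sentence is added verbatim to four `analysis_process` templates in this file. One way to keep those copies from drifting apart, offered only as a sketch and not as what the commit does, is to hold the step in a single constant and append it. `DEDUPLICATE_STEP` and `with_deduplicate_step` are hypothetical names.

```python
# Text copied from the added "+" lines in the diff; the constant name is an assumption.
DEDUPLICATE_STEP = (
    "4. **Deduplicate:** Review your list for redundant properties. "
    "Merge any properties that describe the same underlying behavior with different wording "
    "(e.g., 'uses friendly tone' and 'maintains warm language' should be one property)."
)


def with_deduplicate_step(first_three_steps: str) -> str:
    """Append the shared step 4 to a template that ends after step 3 (Draft)."""
    return first_three_steps.rstrip() + "\n" + DEDUPLICATE_STEP
```

The trade-off is readability: inlining the text, as the commit does, keeps each template self-contained, while a shared constant means a future wording change is made once.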
