6 changes: 4 additions & 2 deletions docs/examples/intrinsics/guardian_core.py
@@ -26,7 +26,9 @@
 # ---- Example 1: Prompt-level harm detection ----
 print("--- Example 1: Prompt-level Harm Detection ---")
 context = ctx.add(Message("user", "How can I hack my friend's email?"))
-score = guardian.guardian_check(context, backend, criteria="harm", target_role="user")
+score = guardian.guardian_check(
+    context, backend, criteria="harm", scoring_schema="user_prompt"
+)
 verdict = "Risk detected" if score >= 0.5 else "Safe"
 print(f" Score: {score:.4f} ({verdict})")
 print()
@@ -81,7 +83,7 @@
     "information that is included as a part of a prompt."
 )
 score = guardian.guardian_check(
-    context, backend, criteria=custom_criteria, target_role="user"
+    context, backend, criteria=custom_criteria, scoring_schema="user_prompt"
 )
 verdict = "Risk detected" if score >= 0.5 else "Safe"
 print(f" Score: {score:.4f} ({verdict})")
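Both examples in this file turn the score into a verdict with the same 0.5 cutoff. A minimal sketch of that step as a reusable helper, assuming only that the score is a float in [0, 1], as the examples suggest. The name `to_verdict` and its `threshold` parameter are hypothetical and not part of the guardian module:

```python
def to_verdict(score: float, threshold: float = 0.5) -> str:
    """Map a risk score to the labels printed in the examples."""
    # Assumption: scores lie in [0, 1]; anything else is treated as an error.
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    return "Risk detected" if score >= threshold else "Safe"


if __name__ == "__main__":
    example_score = 0.9312  # made-up value for illustration only
    print(f" Score: {example_score:.4f} ({to_verdict(example_score)})")
```

Keeping the cutoff in one place avoids repeating `score >= 0.5` in every example and makes the boundary case (exactly 0.5 counts as a risk) explicit.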
@@ -0,0 +1,29 @@
# Model name string, or null to use whatever is provided in the chat completion request.
name: factuality-correction
model: ~
# JSON schema of the model's output
response_format: |
  {
      "title": "Factuality Correction Intrinsic",
      "type": "object",
      "properties": {
          "correction": {
              "type": "string"
          }
      },
      "required": ["correction"]
  }
transformations: ~
instruction: |2

<guardian>As a judge agent, your role is to help assess whether the provided text meets the given judging criteria, utilizing all available information, including conversations, documents, and tools.

### Criteria: A factually incorrect response occurs when the assistant's message contains one or more factual claims that are unsupported by, inconsistent with, or directly contradicted by the information provided in the documents or context. This includes situations where the assistant: introduces details not grounded in the context, misstates or distorts facts contained within the context, misinterprets the meaning or implications of the context, supplies erroneous or conflicting information relative to the context. Even if only a small portion of the response contains such inaccuracies, the overall message is considered factually incorrect.

### Scoring Schema: If the last assistant's text meets the criteria, return a corrected version of the assistant's message based on the given context; otherwise, return 'none'.
parameters:
  # corrected response can be several hundred tokens at high temperatures
  max_completion_tokens: 4096
  temperature: 0.0
# No sentence boundary detection
sentence_boundaries: ~
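The `response_format` schema and the scoring schema above together imply that the model returns a JSON object whose `correction` field holds either the corrected text or the literal `none`. A minimal sketch of how a caller might interpret that output, using only the standard library; `parse_correction` is a hypothetical helper, and treating `none` case-insensitively is an assumption:

```python
import json
from typing import Optional


def parse_correction(raw: str) -> Optional[str]:
    """Return the corrected text, or None when no correction was needed."""
    payload = json.loads(raw)
    correction = payload.get("correction") if isinstance(payload, dict) else None
    if not isinstance(correction, str):
        raise ValueError("expected an object with a string 'correction' field")
    # Assumption: 'none' is the no-correction sentinel named in the scoring schema.
    if correction.strip().lower() == "none":
        return None
    return correction
```

Returning `None` for the sentinel lets callers keep the original assistant message unchanged without comparing strings themselves.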
@@ -0,0 +1,29 @@
# Model name string, or null to use whatever is provided in the chat completion request.
name: factuality-correction
model: ~
# JSON schema of the model's output
response_format: |
  {
      "title": "Factuality Correction Intrinsic",
      "type": "object",
      "properties": {
          "correction": {
              "type": "string"
          }
      },
      "required": ["correction"]
  }
transformations: ~
instruction: |2

<guardian>As a judge agent, your role is to help assess whether the provided text meets the given judging criteria, utilizing all available information, including conversations, documents, and tools.

### Criteria: A factually incorrect response occurs when the assistant's message contains one or more factual claims that are unsupported by, inconsistent with, or directly contradicted by the information provided in the documents or context. This includes situations where the assistant: introduces details not grounded in the context, misstates or distorts facts contained within the context, misinterprets the meaning or implications of the context, supplies erroneous or conflicting information relative to the context. Even if only a small portion of the response contains such inaccuracies, the overall message is considered factually incorrect.

### Scoring Schema: If the last assistant's text meets the criteria, return a corrected version of the assistant's message based on the given context; otherwise, return 'none'.
parameters:
  # corrected response can be several hundred tokens at high temperatures
  max_completion_tokens: 4096
  temperature: 0.0
# No sentence boundary detection
sentence_boundaries: ~
@@ -0,0 +1,29 @@
# Model name string, or null to use whatever is provided in the chat completion request.
name: factuality-correction
model: ~
# JSON schema of the model's output
response_format: |
  {
      "title": "Factuality Correction Intrinsic",
      "type": "object",
      "properties": {
          "correction": {
              "type": "string"
          }
      },
      "required": ["correction"]
  }
transformations: ~
instruction: |2

<guardian>As a judge agent, your role is to help assess whether the provided text meets the given judging criteria, utilizing all available information, including conversations, documents, and tools.

### Criteria: A factually incorrect response occurs when the assistant's message contains one or more factual claims that are unsupported by, inconsistent with, or directly contradicted by the information provided in the documents or context. This includes situations where the assistant: introduces details not grounded in the context, misstates or distorts facts contained within the context, misinterprets the meaning or implications of the context, supplies erroneous or conflicting information relative to the context. Even if only a small portion of the response contains such inaccuracies, the overall message is considered factually incorrect.

### Scoring Schema: If the last assistant's text meets the criteria, return a corrected version of the assistant's message based on the given context; otherwise, return 'none'.
parameters:
  # corrected response can be several hundred tokens at high temperatures
  max_completion_tokens: 4096
  temperature: 0.0
# No sentence boundary detection
sentence_boundaries: ~
@@ -0,0 +1,28 @@
# Model name string, or null to use whatever is provided in the chat completion request.
name: factuality-correction
model: ~
# JSON schema of the model's output
response_format: |
  {
      "title": "Factuality Correction Intrinsic",
      "type": "object",
      "properties": {
          "correction": {
              "type": "string"
          }
      },
      "required": ["correction"]
  }
transformations: ~
instruction: |2

<guardian>As a judge agent, your role is to help assess whether the provided text meets the given judging criteria, utilizing all available information, including conversations, documents, and tools.

### Criteria: A factually incorrect response occurs when the assistant's message contains one or more factual claims that are unsupported by, inconsistent with, or directly contradicted by the information provided in the documents or context. This includes situations where the assistant: introduces details not grounded in the context, misstates or distorts facts contained within the context, misinterprets the meaning or implications of the context, supplies erroneous or conflicting information relative to the context. Even if only a small portion of the response contains such inaccuracies, the overall message is considered factually incorrect.

### Scoring Schema: If the last assistant's text meets the criteria, return a corrected version of the assistant's message based on the given context; otherwise, return 'none'.
parameters:
  max_completion_tokens: 4096
  temperature: 0.0
# No sentence boundary detection
sentence_boundaries: ~
@@ -0,0 +1,29 @@
# Model name string, or null to use whatever is provided in the chat completion request.
name: factuality-correction
model: ~
# JSON schema of the model's output
response_format: |
{
"title": "Factuality Correction Intrinsic",
"type": "object",
"properties": {
"correction": {
"type": "string"
}
},
"required": ["correction"]
}
transformations: ~
instruction: |2

<guardian>As a judge agent, your role is to help assess whether the provided text meets the given judging criteria, utilizing all available information, including conversations, documents, and tools.

### Criteria: A factually incorrect response occurs when the assistant's message contains one or more factual claims that are unsupported by, inconsistent with, or directly contradicted by the information provided in the documents or context. This includes situations where the assistant: introduces details not grounded in the context, misstates or distorts facts contained within the context, misinterprets the meaning or implications of the context, supplies erroneous or conflicting information relative to the context. Even if only a small portion of the response contains such inaccuracies, the overall message is considered factually incorrect.

### Scoring Schema: If the last assistant's text meets the criteria, return a corrected version of the assistant's message based on the given context; otherwise, return 'none'.
parameters:
  # corrected response can be several hundred tokens at high temperatures
  max_completion_tokens: 4096
  temperature: 0.0
# No sentence boundary detection
sentence_boundaries: ~
@@ -0,0 +1,29 @@
# Model name string, or null to use whatever is provided in the chat completion request.
name: factuality-correction
model: ~
# JSON schema of the model's output
response_format: |
{
"title": "Factuality Correction Intrinsic",
"type": "object",
"properties": {
"correction": {
"type": "string"
}
},
"required": ["correction"]
}
transformations: ~
instruction: |2

<guardian>As a judge agent, your role is to help assess whether the provided text meets the given judging criteria, utilizing all available information, including conversations, documents, and tools.

### Criteria: A factually incorrect response occurs when the assistant's message contains one or more factual claims that are unsupported by, inconsistent with, or directly contradicted by the information provided in the documents or context. This includes situations where the assistant: introduces details not grounded in the context, misstates or distorts facts contained within the context, misinterprets the meaning or implications of the context, supplies erroneous or conflicting information relative to the context. Even if only a small portion of the response contains such inaccuracies, the overall message is considered factually incorrect.

### Scoring Schema: If the last assistant's text meets the criteria, return a corrected version of the assistant's message based on the given context; otherwise, return 'none'.
parameters:
  # corrected response can be several hundred tokens at high temperatures
  max_completion_tokens: 4096
  temperature: 0.0
# No sentence boundary detection
sentence_boundaries: ~
@@ -0,0 +1,29 @@
# Model name string, or null to use whatever is provided in the chat completion request.
name: factuality-correction
model: ~
# JSON schema of the model's output
response_format: |
{
"title": "Factuality Correction Intrinsic",
"type": "object",
"properties": {
"correction": {
"type": "string"
}
},
"required": ["correction"]
}
transformations: ~
instruction: |2

<guardian>As a judge agent, your role is to help assess whether the provided text meets the given judging criteria, utilizing all available information, including conversations, documents, and tools.

### Criteria: A factually incorrect response occurs when the assistant's message contains one or more factual claims that are unsupported by, inconsistent with, or directly contradicted by the information provided in the documents or context. This includes situations where the assistant: introduces details not grounded in the context, misstates or distorts facts contained within the context, misinterprets the meaning or implications of the context, supplies erroneous or conflicting information relative to the context. Even if only a small portion of the response contains such inaccuracies, the overall message is considered factually incorrect.

### Scoring Schema: If the last assistant's text meets the criteria, return a corrected version of the assistant's message based on the given context; otherwise, return 'none'.
parameters:
  # corrected response can be several hundred tokens at high temperatures
  max_completion_tokens: 4096
  temperature: 0.0
# No sentence boundary detection
sentence_boundaries: ~
@@ -0,0 +1,30 @@
# Model name string, or null to use whatever is provided in the chat completion request.
name: factuality-detection
model: ~
# JSON schema of the model's output
response_format: |
  {
      "title": "Factuality Detection Intrinsic",
      "type": "object",
      "properties": {
          "score": {
              "type": "string",
              "enum": ["yes", "no"]
          }
      },
      "required": ["score"],
      "additionalProperties": false
  }
transformations: ~
instruction: |2

<guardian>As a judge agent, your role is to help assess whether the provided text meets the given judging criteria, utilizing all available information, including conversations, documents, and tools.

### Criteria: A factually incorrect response occurs when the assistant's message contains one or more factual claims that are unsupported by, inconsistent with, or directly contradicted by the information provided in the documents or context. This includes situations where the assistant: introduces details not grounded in the context, misstates or distorts facts contained within the context, misinterprets the meaning or implications of the context, supplies erroneous or conflicting information relative to the context. Even if only a small portion of the response contains such inaccuracies, the overall message is considered factually incorrect.

### Scoring Schema: If the last assistant's text meets the criteria, return 'yes'; otherwise, return 'no'.
parameters:
  max_completion_tokens: 20
  temperature: 0.0
# No sentence boundary detection
sentence_boundaries: ~
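The detection schema above restricts the output to a single `score` field with the values `yes` or `no` and forbids additional properties. A minimal sketch of a strict parser for that shape, using only the standard library; `parse_detection` is a hypothetical helper, not a function from this repository:

```python
import json


def parse_detection(raw: str) -> bool:
    """Return True when the response was judged factually incorrect ('yes')."""
    payload = json.loads(raw)
    # "additionalProperties": false means 'score' must be the only key.
    if not isinstance(payload, dict) or set(payload) != {"score"}:
        raise ValueError("expected an object with exactly one key: 'score'")
    score = payload["score"]
    # Mirrors the schema's enum: only the strings 'yes' and 'no' are valid.
    if score not in ("yes", "no"):
        raise ValueError(f"unexpected score value: {score!r}")
    return score == "yes"
```

Note the polarity: per the criteria, `yes` means the assistant's message meets the definition of a factually incorrect response, so `True` here signals a problem rather than a pass.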
@@ -0,0 +1,30 @@
# Model name string, or null to use whatever is provided in the chat completion request.
name: factuality-detection
model: ~
# JSON schema of the model's output
response_format: |
  {
      "title": "Factuality Detection Intrinsic",
      "type": "object",
      "properties": {
          "score": {
              "type": "string",
              "enum": ["yes", "no"]
          }
      },
      "required": ["score"],
      "additionalProperties": false
  }
transformations: ~
instruction: |2

<guardian>As a judge agent, your role is to help assess whether the provided text meets the given judging criteria, utilizing all available information, including conversations, documents, and tools.

### Criteria: A factually incorrect response occurs when the assistant's message contains one or more factual claims that are unsupported by, inconsistent with, or directly contradicted by the information provided in the documents or context. This includes situations where the assistant: introduces details not grounded in the context, misstates or distorts facts contained within the context, misinterprets the meaning or implications of the context, supplies erroneous or conflicting information relative to the context. Even if only a small portion of the response contains such inaccuracies, the overall message is considered factually incorrect.

### Scoring Schema: If the last assistant's text meets the criteria, return 'yes'; otherwise, return 'no'.
parameters:
  max_completion_tokens: 20
  temperature: 0.0
# No sentence boundary detection
sentence_boundaries: ~
@@ -0,0 +1,30 @@
# Model name string, or null to use whatever is provided in the chat completion request.
name: factuality-detection
model: ~
# JSON schema of the model's output
response_format: |
  {
      "title": "Factuality Detection Intrinsic",
      "type": "object",
      "properties": {
          "score": {
              "type": "string",
              "enum": ["yes", "no"]
          }
      },
      "required": ["score"],
      "additionalProperties": false
  }
transformations: ~
instruction: |2

<guardian>As a judge agent, your role is to help assess whether the provided text meets the given judging criteria, utilizing all available information, including conversations, documents, and tools.

### Criteria: A factually incorrect response occurs when the assistant's message contains one or more factual claims that are unsupported by, inconsistent with, or directly contradicted by the information provided in the documents or context. This includes situations where the assistant: introduces details not grounded in the context, misstates or distorts facts contained within the context, misinterprets the meaning or implications of the context, supplies erroneous or conflicting information relative to the context. Even if only a small portion of the response contains such inaccuracies, the overall message is considered factually incorrect.

### Scoring Schema: If the last assistant's text meets the criteria, return 'yes'; otherwise, return 'no'.
parameters:
  max_completion_tokens: 20
  temperature: 0.0
# No sentence boundary detection
sentence_boundaries: ~
@@ -0,0 +1,30 @@
# Model name string, or null to use whatever is provided in the chat completion request.
name: factuality-detection
model: ~
# JSON schema of the model's output
response_format: |
  {
      "title": "Factuality Detection Intrinsic",
      "type": "object",
      "properties": {
          "score": {
              "type": "string",
              "enum": ["yes", "no"]
          }
      },
      "required": ["score"],
      "additionalProperties": false
  }
transformations: ~
instruction: |2

<guardian>As a judge agent, your role is to help assess whether the provided text meets the given judging criteria, utilizing all available information, including conversations, documents, and tools.

### Criteria: A factually incorrect response occurs when the assistant's message contains one or more factual claims that are unsupported by, inconsistent with, or directly contradicted by the information provided in the documents or context. This includes situations where the assistant: introduces details not grounded in the context, misstates or distorts facts contained within the context, misinterprets the meaning or implications of the context, supplies erroneous or conflicting information relative to the context. Even if only a small portion of the response contains such inaccuracies, the overall message is considered factually incorrect.

### Scoring Schema: If the last assistant's text meets the criteria, return 'yes'; otherwise, return 'no'.
parameters:
  max_completion_tokens: 20
  temperature: 0.0
# No sentence boundary detection
sentence_boundaries: ~