Commit 4fb38ac

updates
1 parent 2c9c39a commit 4fb38ac

1 file changed

Lines changed: 25 additions & 24 deletions

fern/observability/evals-advanced.mdx

@@ -1090,37 +1090,38 @@ Build up complexity gradually:
 ## Summary

 <Tip>
-**Key takeaways for advanced eval testing:**
+**Key takeaways for advanced eval testing:**

-**Testing strategy:**
+**Testing strategy:**

-- Use smoke tests before comprehensive suites
-- Build regression tests when fixing bugs
-- Cover edge cases systematically
+- Use smoke tests before comprehensive suites
+- Build regression tests when fixing bugs
+- Cover edge cases systematically

-**Validation selection:**
+**Validation selection:**

-- Exact match for critical data
-- Regex for pattern matching
-- AI judge for semantic evaluation
+- Exact match for critical data
+- Regex for pattern matching
+- AI judge for semantic evaluation

-**Performance:**
+**Performance:**

-- Exit early on critical failures
-- Keep conversations focused (5-10 turns)
-- Batch related tests together
+- Exit early on critical failures
+- Keep conversations focused (5-10 turns)
+- Batch related tests together

-**Maintenance:**
+**Maintenance:**

-- Version control evaluations
-- Review failures promptly
-- Update tests with features
-- Document test purpose clearly
+- Version control evaluations
+- Review failures promptly
+- Update tests with features
+- Document test purpose clearly

-**CI/CD:**
+**CI/CD:**

-- Automate critical tests in pipelines
-- Use staging for full suite validation
-- Set quality gate thresholds
-- Run regression suites regularly
-</Tip>
+- Automate critical tests in pipelines
+- Use staging for full suite validation
+- Set quality gate thresholds
+- Run regression suites regularly
+
+</Tip>
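
For readers of the tip above, here is a minimal sketch of how three of its points might fit together: exact match for critical data, regex for pattern matching, and an early exit on critical failures combined with a quality gate threshold. Everything in it (`Check`, `exact`, `matches`, `run_checks`, and the `0.8` default threshold) is a hypothetical name or value chosen for illustration, not the API of the eval platform this file documents. The AI judge is left out because it would require a model call.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Check:
    """One validation applied to an assistant reply (hypothetical structure)."""
    name: str
    passed: Callable[[str], bool]
    critical: bool = False


def exact(expected: str) -> Callable[[str], bool]:
    # Exact match: suited to critical data that must not vary.
    return lambda reply: reply.strip() == expected


def matches(pattern: str) -> Callable[[str], bool]:
    # Regex: suited to checking a pattern such as a date or an ID format.
    compiled = re.compile(pattern)
    return lambda reply: compiled.search(reply) is not None


def run_checks(reply: str, checks: List[Check], threshold: float = 0.8) -> dict:
    """Run checks in order; stop at the first failed critical check."""
    results = {}
    for check in checks:
        ok = check.passed(reply)
        results[check.name] = ok
        if check.critical and not ok:
            # Exit early: later checks are skipped and the gate fails.
            return {"passed": False, "exited_early": True, "results": results}
    pass_rate = sum(results.values()) / len(results) if results else 0.0
    # Quality gate: the run passes only if enough checks succeeded.
    return {"passed": pass_rate >= threshold, "exited_early": False, "results": results}


checks = [
    Check("confirmation_text", exact("Your appointment is confirmed for 2024-05-01."), critical=True),
    Check("contains_iso_date", matches(r"\d{4}-\d{2}-\d{2}")),
]
outcome = run_checks("Your appointment is confirmed for 2024-05-01.", checks)
```

Placing the critical check first means a failed run skips the cheaper-to-ignore checks entirely, which is one way to read "exit early on critical failures"; a real suite might order or weight checks differently.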
