agenteval-replay — deterministic replay of captured agent interactions
agenteval-mutation — mutation testing for evaluation robustness
agenteval-fingerprint — capability fingerprinting of models under test
Cost metrics and root-cause analysis helpers
Smoke test coverage for agenteval-langchain4j (11 tests) and agenteval-spring-ai (12 tests)
Chaos module tests for LatencyInjector, SchemaMutationInjector, and ResilienceEvaluator (22 tests)
AUDIT.md — a full audit report of the library with severity-ranked findings
Changed
Gradle root build removed; Gradle is now scoped to agenteval-gradle-plugin only,
the one module that must be Gradle-native so that publishPlugins can publish it
to the Gradle Plugin Portal. Maven is the authoritative build for all 22 other
modules, so the two module lists can no longer drift apart
DatasetVersionerTest no longer relies on Thread.sleep for timestamp ordering;
it explicitly sets file modification times for deterministic assertions
agenteval-bom now documents why build-tooling modules are intentionally omitted
Bumped dependency versions via Dependabot: Jackson (to 2.21.x via BOM),
Logback 1.5.32, Spring AI 1.1.4, LangGraph4j (latest), Mockito 5.23.0, and
GitHub Actions (actions/checkout@v6, actions/setup-node@v6, actions/upload-artifact@v7, actions/upload-pages-artifact@v5, actions/deploy-pages@v5, actions/stale@v10, dorny/test-reporter@v3)
Test fixtures now use neutral API key strings (fake-key-for-tests, fake-ant-key-for-tests) instead of sk-test / sk-ant-test so credential
scanners do not match on shape
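The DatasetVersionerTest change above replaces sleep-based ordering with explicit timestamps. A minimal sketch of that technique follows; the class name, file names, and base instant are illustrative assumptions, not code from the actual test.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: shows the idea, not the real DatasetVersionerTest.
final class MtimeOrderingSketch {

    /** Creates files v0.json..v(count-1).json with mtimes exactly one minute apart. */
    static List<Path> createWithExplicitMtimes(Path dir, int count) {
        Instant base = Instant.parse("2024-01-01T00:00:00Z"); // arbitrary fixed instant
        List<Path> files = new ArrayList<>();
        try {
            for (int i = 0; i < count; i++) {
                Path file = Files.createFile(dir.resolve("v" + i + ".json"));
                // Set the timestamp directly instead of sleeping between writes.
                Files.setLastModifiedTime(file, FileTime.from(base.plusSeconds(60L * i)));
                files.add(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    /** Returns file names sorted newest first by modification time. */
    static List<String> namesNewestFirst(List<Path> files) {
        Comparator<Path> byMtime = Comparator.comparing(MtimeOrderingSketch::lastModified);
        List<Path> sorted = new ArrayList<>(files);
        sorted.sort(byMtime.reversed());
        List<String> names = new ArrayList<>();
        for (Path p : sorted) {
            names.add(p.getFileName().toString());
        }
        return names;
    }

    private static FileTime lastModified(Path file) {
        try {
            return Files.getLastModifiedTime(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs the whole flow in a fresh temporary directory. */
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("mtime-sketch");
            return namesNewestFirst(createWithExplicitMtimes(dir, 3));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because each timestamp is assigned rather than observed, the ordering does not depend on clock resolution or scheduler timing, which is typically what makes sleep-based tests flaky.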
Deprecated
org.byteveda.agenteval.metrics.llm.PromptTemplate — use org.byteveda.agenteval.core.template.PromptTemplate instead.
Scheduled for removal in 1.0.0.
SemanticSimilarityMetric.cosineSimilarity(List, List) — use VectorMath.cosineSimilarity instead. Scheduled for removal in 1.0.0.
Fixed
MDX parsing errors in the documentation site, plus a PR build check to
catch future regressions (#68, #69)
JunitXmlReporter now configures DocumentBuilderFactory with full XXE
defenses (disallow-doctype-decl, external entity/DTD disabling, setXIncludeAware(false), setExpandEntityReferences(false))
YamlDatasetLoader now caps alias expansion (≤50), nesting depth (≤50),
and total input size (≤3 × 1024 × 1024 code points), and disallows
duplicate/recursive keys, as defense in depth on top of SnakeYAML 2.x's
default SafeConstructor
SpotBugs suppressions narrowed from broad regex patterns
(~...datasets.json.Json.*, ~...datasets.version..*) to explicit <Or><Class .../></Or> enumerations so new classes in those packages
surface genuine findings instead of being blanket-suppressed
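The JunitXmlReporter entry above lists the XXE defenses by name. The sketch below shows how those settings are commonly applied to a DocumentBuilderFactory with the JDK's built-in parser; the class and method names are assumptions for illustration, and the actual reporter code may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical sketch: not the actual JunitXmlReporter implementation.
final class XxeHardeningSketch {

    /** Builds a factory with the defenses named in the changelog entry. */
    static DocumentBuilderFactory hardenedFactory() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Reject any document that carries a DOCTYPE declaration.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            // Disable external general and parameter entities.
            factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
            factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            // Do not fetch external DTDs.
            factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            factory.setXIncludeAware(false);
            factory.setExpandEntityReferences(false);
            return factory;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("XML parser does not support XXE hardening", e);
        }
    }

    /** Returns true if the hardened parser refuses the given XML. */
    static boolean rejects(String xml) {
        try {
            hardenedFactory()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            return true; // parser refused the input, e.g. because of a DOCTYPE
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this configuration a payload containing a DOCTYPE is refused outright, while ordinary report XML such as `<testsuite name="ok"/>` still parses. The entity and DTD switches are redundant once DOCTYPE is disallowed, but keeping them guards against a parser that ignores the first feature.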
Security
XXE hardening in JunitXmlReporter (agenteval-reporting)
YAML resource-exhaustion hardening in YamlDatasetLoader (agenteval-datasets)
.gitignore now covers common secret patterns (.env*, *.jks, *.keystore, *.p12, credentials.json)