Releases: agentevals-dev/agentevals

v0.9.1

15 May 17:14
9c39e64

What's Changed

New Contributors

Full Changelog: v0.9.0...v0.9.1

v0.9.0

15 May 08:31
b93ab07

What's Changed

  • BREAKING: consolidate 'metrics' and 'custom_evaluators' into 'evaluators' by @peterj in #149
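
The release note above does not describe the new configuration schema, so the following is only a rough sketch of what a migration could look like. It assumes, hypothetically, that the old configuration is a mapping whose 'metrics' and 'custom_evaluators' keys each hold a list, and that the new 'evaluators' key holds their concatenation. The function name `migrate_config` is illustrative and not part of agentevals; check #149 for the actual format before relying on it.

```python
from typing import Any


def migrate_config(old: dict[str, Any]) -> dict[str, Any]:
    """Fold the legacy list-valued keys into a single 'evaluators' list.

    ASSUMPTION: 'metrics' and 'custom_evaluators' are lists and the new
    'evaluators' key is their concatenation. The real schema may differ.
    """
    legacy_keys = ("metrics", "custom_evaluators")

    # Copy every key except the two legacy ones.
    new = {key: value for key, value in old.items() if key not in legacy_keys}

    # Start from any entries already listed under 'evaluators'.
    merged = list(old.get("evaluators", []))
    for key in legacy_keys:
        for item in old.get(key, []):
            if item not in merged:  # skip duplicates, keep first-seen order
                merged.append(item)

    if merged:
        new["evaluators"] = merged
    return new
```

Under those assumptions, a mapping with `"metrics": ["a"]` and `"custom_evaluators": ["b"]` would come out with `"evaluators": ["a", "b"]` and every other key left untouched.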

Full Changelog: v0.8.4...v0.9.0

v0.8.4

14 May 11:42
094f9e8

What's Changed

Full Changelog: v0.8.3...v0.8.4

v0.8.3

14 May 09:05
4406260

What's Changed

Full Changelog: v0.8.2...v0.8.3

v0.8.2

11 May 10:20
43bc581

What's Changed

  • implement plugin model for ResultSink by @peterj in #142

Full Changelog: v0.8.1...v0.8.2

v0.8.1

06 May 14:23
f12a891

What's Changed

  • Durable storage backend (preview). New opt-in Postgres backend for persisting evaluation runs and their results. The Helm chart now ships an optional bundled Postgres for easy trials. APIs and schema may change without notice while this matures. (#135)

Fixed

  • Wheel publishing to PyPI works again; a packaging defect produced duplicate file entries that PyPI rejected. (#138)
  • PyPI project page now renders the README logos correctly. (#126, thanks @frivas-voiceatlas)

Upgrade notes

  • No action required for existing users. The default in-memory backend is unchanged.
  • 0.8.0 was yanked due to the packaging issue above. Pin to 0.8.1 or later.
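
As a minimal sketch of enforcing the pin above at startup, the snippet below compares a version string against the 0.8.1 floor using only the standard library. The helper names are illustrative, and the distribution name passed to `installed_meets_minimum` is an assumption: substitute whatever name the package is installed under in your environment.

```python
import re
from importlib import metadata

MIN_VERSION = (0, 8, 1)  # first release after the yanked 0.8.0


def parse_release(version: str) -> tuple[int, ...]:
    """Return the leading numeric components, e.g. 'v0.8.1' -> (0, 8, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip().lstrip("v"))
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version: str, minimum: tuple[int, ...] = MIN_VERSION) -> bool:
    """True if the version is at or above the minimum release."""
    return parse_release(version) >= minimum


def installed_meets_minimum(dist_name: str) -> bool:
    """Check the installed distribution; dist_name is supplied by the caller."""
    try:
        return meets_minimum(metadata.version(dist_name))
    except metadata.PackageNotFoundError:
        return False
```

This deliberately ignores pre-release and local version suffixes. For full PEP 440 ordering, the third-party `packaging` library is the usual choice.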

Full Changelog: v0.7.3...v0.8.1

v0.7.3

30 Apr 14:56
da3ca5e

What's Changed

Full Changelog: v0.7.2...v0.7.3

v0.7.2

20 Apr 08:05
a4dc0c8

What's Changed

Full Changelog: v0.7.1...v0.7.2

v0.7.1

17 Apr 09:53
572321b

What's Changed

Full Changelog: v0.7.0...v0.7.1

v0.7.0

15 Apr 10:48
520f386

What's Changed

New Contributors

Full Changelog: v0.6.4...v0.7.0