Skip to content

Commit 6b75096

Browse files
committed
docs: changelog and blog post for v0.8.4
1 parent 92f433c commit 6b75096

4 files changed

Lines changed: 201 additions & 1 deletion

File tree

Lines changed: 72 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,72 @@
1+
---
2+
title: "What's new in Hindsight 0.8.4"
3+
description: Multi-LLM failover and round-robin routing, finer recall control, scheduled mental-model refresh, richer token accounting, and a batch of data-integrity fixes in Hindsight 0.8.4
4+
authors: [nicoloboschi]
5+
date: 2026-06-30
6+
hide_table_of_contents: true
7+
tags: [release]
8+
---
9+
10+
Hindsight 0.8.4 builds on [0.8.3](/blog/2026/06/18/version-0-8-3) with a focus on **reliability and control**: keep memory operations running when a provider degrades with **multi-LLM failover and round-robin routing**, tune retrieval more precisely with **finer recall controls**, keep knowledge current with **scheduled mental-model refresh**, and get **more accurate token accounting**. It also lands a batch of **data-integrity and robustness fixes**. **Self-managed deployments should upgrade.**
11+
12+
<!-- truncate -->
13+
14+
<video controls muted loop playsinline width="100%" style={{borderRadius: "8px"}}>
15+
16+
<source src="/img/blog/release084/hindsight-0-8-4-release-notes.mp4" type="video/mp4" />
17+
</video>
18+
19+
- [**Multi-LLM Failover & Routing**](#multi-llm-failover--routing): Survive provider outages with failover and round-robin across LLMs.
20+
- [**Finer Recall Control**](#finer-recall-control): Per-stage scores, two-level score filtering, configurable recency decay, and observation-aware dedup.
21+
- [**Scheduled Mental-Model Refresh**](#scheduled-mental-model-refresh): Keep consolidated knowledge fresh on a cron schedule.
22+
- [**Sharper LLM Control & Accounting**](#sharper-llm-control--accounting): Per-operation temperature, per-scope timeouts, and richer token usage.
23+
- [**Data-Integrity & Robustness Fixes**](#data-integrity--robustness-fixes): Why you should upgrade.
24+
25+
## Multi-LLM Failover & Routing
26+
27+
You can now configure **multiple LLMs and have Hindsight route across them automatically**, with **failover** when one provider errors and **round-robin** to spread load. A single provider outage or rate-limit no longer has to stall memory operations.
28+
29+
Each member of a multi-LLM configuration can be tuned individually — including **LiteLLM Router settings**, **Vertex AI service account keys**, and **per-member Vertex project and region** — so you can mix providers and accounts with the controls each one needs.
30+
31+
This release also adds two new OpenAI-compatible providers, **Requesty** and **Atlas Cloud**, so you can point Hindsight at them directly.
32+
33+
## Finer Recall Control
34+
35+
Several changes give you more precise control over what recall returns and how it's ranked:
36+
37+
- **Per-stage scores and two-level filtering.** Recall now exposes structured per-stage scores, and `min_scores` supports two-level filtering — so you can require minimum relevance at each retrieval stage instead of a single blunt cutoff. (Now correctly threaded through the maintained Python and TypeScript SDK wrappers, too.)
38+
- **Configurable recency decay.** Choose how strongly recency influences ranking — `linear`, `exponential`, or `none`.
39+
- **Observation-aware dedup.** A new `prefer_observations` option drops raw facts that have already been superseded by consolidated observations, so results favor the synthesized view.
40+
- **Exact filtering of global observations.** You can now exactly filter untagged/global observations.
41+
42+
## Scheduled Mental-Model Refresh
43+
44+
Mental models can now be **refreshed on a cron schedule**. Instead of relying solely on activity-triggered refreshes, you can keep a bank's consolidated knowledge current on a cadence you define.
45+
46+
## Sharper LLM Control & Accounting
47+
48+
- **Per-operation temperature.** Set different temperatures per operation for tighter control over response style across retain, recall, and reflect.
49+
- **Per-scope timeouts and retries.** Per-scope LLM timeout and retry policies are now applied consistently across providers.
50+
- **Richer token accounting.** Usage totals now include cached and "thoughts" tokens where supported, and reasoning tokens are tracked for OpenAI-compatible providers — for more accurate cost reporting.
51+
- **More reliable structured output.** Anthropic strict structured output now goes through forced tool use, and reflect's structured-output retries are capped to avoid runaway retry loops.
52+
- **Configurable upload size.** The maximum upload size is now configurable in the control plane.
53+
- **Faster, more scalable bank stats.** Bank statistics now scale to large deployments, and an on-demand refresh lets you force an exact recount when you need it.
54+
- **Full memory details in the explorer.** The control plane's memory explorer now shows the complete details of each memory, making inspection and debugging easier.
55+
56+
## Data-Integrity & Robustness Fixes
57+
58+
The reason to upgrade — a set of fixes that protect what gets stored and keep background work healthy:
59+
60+
- **Append-mode chunking.** Retain append mode now merges JSON arrays so conversation-aware chunking is preserved.
61+
- **Observation search vectors.** Search vectors are now maintained on observation insert/update, keeping search and consolidation correct.
62+
- **Consolidation safety.** Consolidation now keeps items by default when a dedup action is missing, preventing unintended drops.
63+
- **Migration bootstrap.** Migration bootstrap now respects the configured vector extension.
64+
- **Accurate bank stats.** Bank stats cache is correctly invalidated after deletes and clears, so counts stay accurate.
65+
- **Graph maintenance deadlocks.** Concurrent inserts no longer trigger enqueue deadlocks during graph maintenance.
66+
67+
Other notable fixes:
68+
69+
- **Token usage on parse failure.** Provider token usage is now preserved even when tool-call argument parsing or validation fails, so cost reporting stays accurate.
70+
- **Reflect JSON envelopes.** Reflect now unwraps JSON-wrapped answers returned by some models.
71+
- **Cleaner error paths.** List endpoints reject negative `limit`/`offset` with a `422` instead of a server error, async operations return `404` when a bank doesn't exist, and `PATCH bank` / dry-run extract no longer create banks unintentionally.
72+
- **Retain error messages.** Retain error summaries now preserve the underlying exception message for easier debugging.

0 commit comments

Comments
 (0)